I. FORMAL CAUSALITY AND THE LAW OF ERROR


If one will study Hagen’s demonstration of the law of error as given 
by Mansfield Merriman in his little work on the Method of Least 
Squares, one will see not only a beautiful illustration of the law of 
chance but will also understand how all measured effects may be con- 
ceived of as the resultant action of a vast number of tiny causes 
practically infinite in number. One who thought superficially about 
the matter might come to the conclusion that chance groupings of 
infinitesimal causes can account for all phenomena. 

Let us first consider the philosophical implications that underlie
Hagen’s demonstration of the law of error.

Why do the shots at a target group about a mean? The only 
answer is that the marksman intends to hit this particular target. 
If anybody at all were to pick up a gun and shoot anywhere he pleased 
there would be no tendency for the shots to group about the bull’s 
eye of a definite target in one particular place. We might study the 
mechanism of rifles and the chemistry of powder exhaustively but 
nothing in the mechanics of the rifle nor the chemistry of powder 
explains the fundamental fact in the law of error: The values vary 
about a particular mean. The law of chance and the theory of prob- 
abilities account very nicely for the form of distribution, but the law 
of chance supposes a distribution to start with; it accounts for its 
form but it can never explain why the values are distributed about 
this mean rather than another, or why they are distributed at all 
rather than scattered without any distribution all over the universe.

Over and above the efficient causality of the exploding gunpowder 
by which a bullet is shot with a given velocity and the instrumental
causality of the rifle by which the path of the bullet is given some
direction, there is the formal causality of the will of the marksman to 
direct all the bullets so as to hit a particular target. 

Let us bear in mind the distinction between formal and efficient 
causality. In the case of shooting at a target we can distinguish: 

(a) The efficient causality due to the exploding gunpowder. 

(b) The instrumental causality due to the use of a rifle which gives 
the bullet some definite trajectory in some definite plane. 

(c) The formal causality which collocates all the shots around a 
definite mark. It is due to the intention of one particular marksman 
to hit one particular spot on one particular target. If anybody took 
up one and the same gun and shot anywhere he pleased, the efficient 
causality of each shot would be about the same, namely, a certain 
charge of gunpowder; the instrumental causality would be about the 
same, namely, one and the same gun; but there would be no formal 
cause and the shots would not be distributed according to the law of 
error. If two marksmen shot at two different points on the target
there would be two formal causes acting and a bimodal distribution 
would result. In general, whenever values are distributed according 
to the law of error we must admit the presence of a formal cause, for 
if we leave out of consideration the formal distributing cause the fact 
of distribution remains unexplained. 

And so we may say that if values are distributed according to the 
law of error a formal cause has been operating in their production. 
When now we come to measurements in the organic and mental world, 
we find that in general or at least with great frequency they are 
distributed according to the law of error. If that is the case, then 
formal causes are operating in the organic and mental world. Every 
attempt, therefore, to explain the organic and mental world solely 
by the efficient causality of various forms of energy and the instru- 
mental causality of various mechanisms is bound to be inadequate 
because it leaves unaccounted for the very important fact that the 
phenomena are grouped; the groupings themselves must be explained 
and they demand for their explanation the activity of formal causes. 


II. THE ULTIMATE ELEMENTS OF CAUSAL ACTIVITIES AND
THEIR RELATIONS 


(a) A monistic hypothesis is excluded. 
Do we live in a monistic world in which all phenomena are ulti-
mately to be explained as the combination of primordial elements,
all of essentially the same nature so that one thing differs from another
only in as much as it contains a greater or lesser number of these 
primordial elements? 

It is difficult to see how this would be unless formal causes inter- 
vened to arrange the primordial elements in groupings that would, 
as groups, differ from each other. 

Let us, however, make the supposition that all causality may be 
reduced to a vast number of primordial causal elements that present 
no qualitative but only quantitative differences and are ungrouped 
by formal causality. The trait, character or ability X for individual a
may be thus expressed:

$$x_a = a_{1a} + a_{2a} + \cdots + a_{ma} = m\bar{a}_a,$$

where $\bar{a}_a$ is the average of the a values. For individual b

$$x_b = a_{1b} + a_{2b} + \cdots + a_{mb} = m\bar{a}_b.$$

We thus have a series of values which may be expressed in this manner:
$x = m\bar{a}$, where m is a constant and x and $\bar{a}$ are variables.

Similarly for the trait y we have a series of values: $y = n\bar{a}$.

The correlation between these values is unity. Therefore, under 
this supposition, the correlation of all traits and characters would 
approach unity within the limits of experimental error. Thus my 
colleague, Dr. J. Edward Rauth, informs me that he found all inter- 
correlations unity between the boiling-point, molecular weight, index of 
refraction and reciprocal of the specific gravity in a series of homolo- 
gous compounds, the monobromide compounds of the methane series. 
We know, however, that this is not universally the case, but that corre- 
lations between traits, if we include character traits as well as cognitive, 
vary from zero to plus or minus unity. 
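The unit correlation entailed by the monistic supposition may be checked numerically. What follows is only a minimal sketch under stated assumptions: the per-individual averages in `mean_a` and the constants `m` and `n` are hypothetical values chosen for illustration, and `pearson` is an ordinary product-moment correlation written out by hand.

```python
import math


def pearson(xs, ys):
    """Product-moment correlation of two equally long lists."""
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical average value of the elementary causes for five individuals.
mean_a = [0.80, 1.10, 0.95, 1.30, 1.05]

# Hypothetical numbers of elementary causes entering traits x and y.
m, n = 40, 25

x = [m * a for a in mean_a]  # x = m * (average a)
y = [n * a for a in mean_a]  # y = n * (average a)

r = pearson(x, y)
print(round(r, 6))
```

Whatever values are substituted for `mean_a`, `m` and `n`, the coefficient remains unity within rounding, since x and y are constant multiples of one and the same variable.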

(b) A pluralistic hypothesis is possible. 

Let us pass on now to the consideration of a plurality of primal 
causal factors. 

All the native traits of an individual of today are due to hereditary 
and environmental causes which go back to previous ages. As we go 
back the number of parents increases, although due to overlapping 
the increase is not simply $2^n$. The causes of the environmental
conditions of today are due to those of yesterday and these increase in 
number as we go back in years.

For practical purposes we may regard the parents of twenty or
thirty generations ago as unrelated and independent causes. The 
same may be said of the ultimate environmental causes of the nth 
age in the past.

Ultimately we must go back to the primordial origin of things or 
some state in an age long past out of which the individual of the present 
day has been derived. 

The ability, therefore, of the individual of today may be regarded 
as due to a very great number of elementary primordial causes, each 
contributing a very small fraction to the individual’s ability and each 
primordial cause uncorrelated with every other primordial cause. 

Each one of the million or so individuals some twenty generations 
ago who contributed to the individual of today is a tiny contributing 
cause in his production. Allowing for overlap in the ancestry there 
were many thousands of different individuals many generations ago 
who contributed to the individual of today. Each of these thousands 
of parents differed from every other parent. Besides, these individ-
uals, their progeny and the individual of today have been subjected
to all manner of environmental causes, some toxic and affecting the 
germ plasm, some producing social changes, some affecting food, 
clothing, shelter and education. As a result of these many causes 
differing in nature we have the individual of today with his organic 
structure, modified by its own environment and the personal endeavor 
of the individual. 

What were the ultimate causes in the primordial nebula, group of 
organisms or whatever it was from which the present order has been 
derived? Fortunately we do not have to know, but we can explain 
many things in the correlational analysis of mental life by assuming 
that whatever is measured by any technique of measurement is due to 
a vast number of different primordial causes, e.g. the parents of n 
generations ago, each contributing a tiny fraction to the thing that 
is now measured and each uncorrelated with every other such cause. 
We may write therefore 


$$X = a_1 + a_2 + a_3 + \cdots + a_m$$


X is the thing now measured, the a values are the tiny effects con- 
tributed by the primordial uncorrelated causes, and m is the number 
of such causes which must be regarded as very vast and practically 
(but not really) infinite.

Let us now for the sake of simplicity assume Y as constituted by
the same causes that constitute X but fewer in number, so that


$$Y = a_1 + a_2 + a_3 + \cdots + a_n, \quad \text{where } m > n$$


Each a cause has affected not only one individual but also thou- 
sands of individuals at the present day. What can be said about the 
variation of the effects of primordial causes? Does the variation of 
one primordial cause differ from that of another? One group of 
boulders might vary much more than another, but if we ground them 
down to dust the standard deviation of the weights of the particles in 
one sample of ultimate dust would tend to equal that of another. 
For the sake of simplicity, however, we may assume that the standard 
deviations of all a values are identical. 

With these assumptions the correlation between any two variables 
is dependent solely upon the number of identical elements involved. 

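For the sake of simplicity let us first assume that m = 4 and n = 2;
that the above equations between X and Y are expressed as standard
measures and that, as we have pointed out above, $r_{a_i a_j} = 0$. Then

$$r_{xy} = \frac{\sum (a_1 + a_2 + a_3 + a_4)(a_1 + a_2)}{\sqrt{\sum (a_1 + a_2 + a_3 + a_4)^2}\,\sqrt{\sum (a_1 + a_2)^2}} = \frac{2}{\sqrt{4}\sqrt{2}} = \frac{1}{\sqrt{2}}$$

If m = 100 and n = 50,

$$r_{xy} = \frac{50}{\sqrt{100}\sqrt{50}} = \frac{1}{\sqrt{2}}$$

In general where m > n we have

$$r_{xy} = \frac{n}{\sqrt{m}\sqrt{n}} = \frac{1}{\sqrt{m/n}} = \frac{\sqrt{n}}{\sqrt{m}}$$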


The correlation between X and Y, therefore, under the assumptions 
we have made depends solely on the ratio of the number of causes in X 
to that in Y (all the causes in Y being contained in X). 
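The dependence of the correlation on the ratio n/m alone may be illustrated by a small simulation. This is a sketch under the assumptions of the text (uncorrelated elementary causes with equal standard deviations); the normal distribution of each cause, the number of simulated individuals and the seed are assumptions made here for illustration only.

```python
import math
import random


def nested_overlap_r(n, m):
    """Theoretical r when all n causes of Y are among the m causes of X."""
    return math.sqrt(n / m)


def pearson(xs, ys):
    """Product-moment correlation of two equally long lists."""
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_nested(n, m, individuals=20000, seed=1):
    """Draw m independent unit-variance causes per individual; X sums all
    of them, Y sums only the first n."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(individuals):
        causes = [rng.gauss(0.0, 1.0) for _ in range(m)]
        xs.append(sum(causes))
        ys.append(sum(causes[:n]))
    return pearson(xs, ys)


theory_small = nested_overlap_r(2, 4)     # the case m = 4, n = 2
theory_large = nested_overlap_r(50, 100)  # the case m = 100, n = 50
simulated = simulate_nested(2, 4)

print(theory_small, theory_large, simulated)
```

The two theoretical values coincide because only the ratio of n to m enters the formula; the simulated coefficient should fall close to them, differing only by sampling error.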

If the total causality of the larger be regarded as unity, then the
percentage identity (a) plus non-identity (b) is equal to

$$a + b = \frac{n}{m} + \frac{m-n}{m} = 1,$$

but $r_{xy}^2 = \frac{n}{m} = a$ and $r_{x(x-y)}^2 = \frac{m-n}{m} = b$, so that

$$r_{xy}^2 + r_{x(x-y)}^2 = 1.$$

The correlation $r_{x(x-y)}$ is usually spoken of as the coefficient of
alienation.








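It should be noted that

$$x = a_1 + a_2 + a_3 + a_4$$
$$y = a_1 + a_2$$
$$x - y = a_3 + a_4$$
$$r_{xy} = \frac{\sqrt{n}}{\sqrt{m}}, \qquad r_{x(x-y)} = \frac{\sqrt{m-n}}{\sqrt{m}}$$

and

$$r_{xy}^2 + r_{x(x-y)}^2 = 1$$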


This particular notation holds, of course, under the particular assump-
tion that all the causes in y are found in x.
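The complementary relation between the correlation and the coefficient of alienation can be verified for any pair of counts. The sketch below is illustrative only; the values m = 100 and n = 36 are hypothetical and were chosen so that the two coefficients come out near .6 and .8.

```python
import math


def nested_r(n, m):
    """r_xy when the n causes of y are all contained in the m causes of x."""
    return math.sqrt(n / m)


def alienation(n, m):
    """r_x(x-y): correlation of x with the m - n causes not shared by y."""
    return math.sqrt((m - n) / m)


m, n = 100, 36                 # hypothetical numbers of causes
r_xy = nested_r(n, m)          # about .6
k = alienation(n, m)           # about .8
total = r_xy ** 2 + k ** 2     # proportion identical plus proportion not identical

print(r_xy, k, total)
```

The squares, not the coefficients themselves, are the proportions of identical and non-identical causes, which is why they and not the coefficients sum to unity.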

With these assumptions we have arrived at a result obtained by 
Tryon.¹

If however, as is often the case, X contains a series of elements and 
Y a series of elements, and some are contained in both, and others 
are contained in one but not in the other, the derivation of the per- 
centage of overlap in the causal factors is not given by the above 
equation. 


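Let $x_1 = a_1 + a_2 + a_3 + a_4$

Let $x_2 = a_3 + a_4 + a_5 + a_6$

Then²

$$r_{x_1x_2} = \frac{2}{\sqrt{4}\sqrt{4}} = \frac{1}{2} = \frac{c}{\sqrt{a}\sqrt{b}}$$

$$r_{x_1x_2} = \frac{\sqrt{c}}{\sqrt{a}} \cdot \frac{\sqrt{c}}{\sqrt{b}} = r_{cx_1}r_{cx_2}$$

where c = number of common elements
a = number of elements in $x_1$
b = number of elements in $x_2$

$$\frac{r_{x_1x_2}}{r_{cx_2}} = r_{cx_1}; \qquad r_{cx_1}^2 + r_{x_1(x_1-c)}^2 = 1$$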
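The case of partial overlap lends itself to a short numerical illustration. The following sketch assumes, as the text does, unit-variance uncorrelated elements; the function names are introduced here for convenience and are not taken from any published source.

```python
import math


def overlap_r(c, a, b):
    """Correlation of two sums sharing c elements, the first having a
    elements in all and the second b elements in all."""
    return c / (math.sqrt(a) * math.sqrt(b))


def r_common(c, size):
    """Correlation of a sum of `size` elements with its c common elements."""
    return math.sqrt(c / size)


# The example of the text: four elements in each sum, two in common.
r_x1x2 = overlap_r(2, 4, 4)
product = r_common(2, 4) * r_common(2, 4)

# When c = a = n and b = m the expression reduces to sqrt(n / m).
nested = overlap_r(2, 2, 4)

print(r_x1x2, product, nested)
```

The equality of `r_x1x2` and `product` is the factorization of the correlation into the two correlations with the common part; `nested` recovers the earlier case in which one measure is wholly contained in the other.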





1 Tryon, Robert Choate: “The Interpretation of the Correlation Coefficient.”
Psychol. Review, Vol. XXXVI, 1929, pp. 419-445 (p. 421).

2 The expression $\frac{c}{\sqrt{a}\sqrt{b}}$ reduces to our former one $\sqrt{\frac{n}{m}}$ when c = a = n
and b = m.
Therefore¹

$$\left(\frac{r_{x_1x_2}}{r_{cx_2}}\right)^2 + r_{x_1(x_1-c)}^2 = 1$$


In order to obtain the proportion of causes that overlap in $x_1$ and
$x_2$ we must know $r_{cx_1}$ or $r_{cx_2}$ or their underlying proportions. Our
method thus gives us very simply a result obtained by another author 
using a different method. 

We thus see that by assuming a plurality of uncorrelated causes 
we can account for all positive correlations between variables by the
percentage overlap of primordial causes common to two series of things
measured. 


III. THE COMPOSITION OF MEASURES AND THEIR INTERCORRELATION 


Every attempt at measurement is an effort to measure something 
that apparently, if not really, has a definite specific character such as 
height, weight, power of forming associations, attention, etc. The 
measure made use of may at times unavoidably give a reading which 
is dependent on other factors besides the particular thing that one 
attempts to measure. In the terms of ultimate causes the thing one 
actually measures depends on a number of factors which contribute 
to the thing one wants to measure, plus a number of other factors which 
have nothing to do with it, so if we designate the measure actually
obtained by $x_1$ we have


$$x_1 = a_1 + a_2 + \cdots + a_{m_1} + v_1 + v_2 + \cdots + v_{n_1}$$


Here the a values contribute to the desired measure while the v
values have nothing to do with it. The a values affect shots at the
same target; the v values affect shots at some other target or pure
random shots. According to our previous considerations, the a
values are influenced by one and the same formal cause, the v values
by one or more different formal causes or perhaps by none at all.

The a values are the ultimate $\Delta x$ values whose summation in
Hagen’s demonstration gives rise to the law of error. They are
therefore summated in each individual thing measured, so that we
can write


$$x_1 = m_1 a + v_1 + v_2 + \cdots + v_{n_1}.$$





1 This again is a result obtained by Tryon: Loc. cit., p. 429. 
Similarly we may write for another measurement of the same
trait, character or property


$$x_2 = m_2 a + v_1 + v_2 + \cdots + v_{n_2}$$


From these equations we have for the correlation of sums


$$r_{x_1x_2} = \frac{m_1 m_2}{\sqrt{m_1^2 + n_1}\,\sqrt{m_2^2 + n_2}}$$


It is important to bear in mind the effect of the formal cause in 
summating unlike elementary causes to produce one and the same 
unitary effect. 

Thus the exact location of a mark made by one shooting at a target 
might be influenced by 

(a) Air currents blowing in various directions in the path of the 
bullet. The ultimate causes of the air currents during a particular 
shot would be multiple and various. 

(b) Hand tremors which in turn would be due to many and various 
causes different from one another and different from those that 
were responsible for the air currents. 

(c) Previous practice, again ultimately due to many and various 
causes. 

(d) Hereditary aptitude, again due to many causes. 

All these causes find one final expression, the distance of a certain 
mark from the center of the target. 

That all the shots group about this particular center and each 
particular shot approximates this center more or less perfectly is due to 
the will of the marksman to hit the center of the target, that is to say, 
to a formal cause. Air currents, hand tremors, previous practice, 
hereditary influences, though very different in themselves are sum- 
mated in virtue of a formal cause to distances on a target from the 
point where the shot hits to the center of the bull’s-eye.¹

It is easily seen that the tetrad function may be derived from a
series of such values for $r_{x_1x_2}$.²
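That the tetrad function follows from correlations of this form may be seen in a small computation. The sketch takes the correlation of sums as $m_1m_2/(\sqrt{m_1^2+n_1}\sqrt{m_2^2+n_2})$; the four pairs of counts in `measures` are hypothetical and serve only as an illustration.

```python
import math


def r_sums(m1, n1, m2, n2):
    """Correlation of two measures, each the sum of m summated a values
    (one formal cause) and n independent v values."""
    return (m1 * m2) / (math.sqrt(m1 ** 2 + n1) * math.sqrt(m2 ** 2 + n2))


# Hypothetical (m, n) pairs for four measures of one and the same trait.
measures = [(8, 5), (7, 3), (5, 4), (6, 2)]


def r(i, j):
    """Correlation between measure i and measure j of the list above."""
    (mi, ni), (mj, nj) = measures[i], measures[j]
    return r_sums(mi, ni, mj, nj)


# The three tetrad differences of the four measures.
tetrads = [
    r(0, 1) * r(2, 3) - r(0, 2) * r(1, 3),
    r(0, 1) * r(2, 3) - r(0, 3) * r(1, 2),
    r(0, 2) * r(1, 3) - r(0, 3) * r(1, 2),
]

print(tetrads)
```

Each correlation is the product of one factor belonging to the first measure and one belonging to the second, so every tetrad difference vanishes apart from rounding, whatever counts are assumed.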

What is the correlation between the alpha values in the test and the 
total sum of elementary causes that the test embraces? Let us term 
the alpha values $g_1$.








1 Cf. also above, page 407. 
2 Note that if some of the v values of one measure were identical with v values of
another there would be a group factor present and the tetrad function would not hold.













Formal Causality and the Analysis of Mental Life 409 


This is given very simply by the expression


$$r_{x_1g_1} = \sqrt{\frac{m_1}{m_1 + n_1}}$$


for all the alpha values are contained in the $m_1 + n_1$ ultimate causes.
Correlations with g calculated from formal set-ups give a lower value 
by this formula than when calculated by Spearman’s formula for a 
whole table. 

What is the relationship between the a values in one measure of a 
trait and those in another? 

If we could suppose that every measure of a trait involves all the 
causes that concur to produce that trait, then the correlation between 
the a values in one test and those in another would be unity. 

Or if we could suppose that one measure certainly involved the 
same ultimate causes as another, then the correlation between the a 
values in one test and those in another would be unity. 

But neither supposition is likely, particularly if the measures 
represent independent avenues of approach. 

It is, therefore, likely that the correlation of the a values in one 
test with those in another will always fall short of unity. 

The value of the correlation of a values in one test with those in 
another is given by the expression 





$$r_{g_1g_2} = \frac{c}{\sqrt{m_1 m_2}}$$


where c is the number of ultimate elements common to the two groups. 

It is evident that our various equations involving m, n and c are
for purposes of interpretation only and not for calculation, for the abso- 
lute values of the number of causes involved must remain unknown. 
Correlation coefficients, depending on the relative numbers of the 
ultimate causes, are calculable and with these we must work. 

Spearman has given a formula¹ for the correlation of a specific
variable with the underlying general factor of a table of correlations. 
If the considerations we have here urged are valid, the correlation so 
obtained must be between a specific variable and an averaged general 
factor derived from the whole table. 





1 Abilities of Man, 1927, appendix, p. xvi. 
The same is to be said of the formula we have published for the
correlation between general factors;¹ the correlation obtained is
between two averaged general factors. 

This will be brought out more clearly by looking at what we do 
when we obtain the weights for the multiple regression equations and 
work out the correlation between general factors. 

All weights and correlations are in reality averages from N equa- 
tions derived from the N individuals in the population studied. 

No individual has been acted upon by exactly the same number of 
ultimate causes as any other individual. This fact means that in 
measuring his height, weight, memory, etc., neither the a values nor 
the v values are the same for any two individuals. This fact gives rise 
to individual variation which is the basis of sampling error. 

The fact that no single test nor the group of tests used involves 
all the ultimate causal factors which are configured by a formal cause 
gives rise to errors of estimate in the multiple regression equation. 

Furthermore, the sampling errors for one test are not the same as for 
another test even though we have the same population, e.g., individ- 
uals who might be subjected to very similar causes leading to their 
weight might have been subjected to very different causes as far as 
their memory is concerned. 

Now in obtaining the weights for the regression equations a series 
of individual equations are summed,² and the weights so obtained are
dependent on the average participation of the individuals in the a 
values or ultimate primordial causes that are configured by the formal 
cause. 

Furthermore the intercorrelations of all the tests involved are 
employed in the determination of the weights so that the resultant 
estimate of the individual’s participation in the common factor is a 
result of a complicated process of averaging. If one formal cause 
underlies the whole series of tests, then the individual’s participation 
in the elements configured by that cause is measured far better by 
the multiple regression estimate than it could be by any single test. 

The formal set-up in Table I indicates what is meant by the group-
ing due to a formal cause. It will be noticed that only $a_5$, $a_6$, $a_7$, $a_8$





1 “Multiple Correlation and the Correlation between General Factors.”
Studies in Psychology and Psychiatry, Vol. III, No. 1, 1931. For correcting this
formula for errors of observation and estimate see Loc. cit., Vol. III, No. 3, 1933,
p. 114.

2 Studies in Psychology and Psychiatry, Vol. III, No. 1, 1931, p. 1.

are common to all four variables. Were the a values ungrouped these
values and these alone would constitute the general factor. The 
general factor in one group would be identical with that in the other, 
but some of the other a values would constitute special bonds and the 
tetrad function would not hold. The table shows how various values 
are calculated according to the formulae given above. 

The problem now arises whether or not results obtained in various 
groups of experimental data conform to the above theoretical analysis. 

If a general factor in one group is identical with that in another 
group then the correlation between them should be unity. If general
factors vary in their composition their correlation must fall short of
unity. 


IV. THE UNITY AND IDENTITY OF COMMON FACTORS 


(i) A formula for the correlation of general factors.

In Studies in Psychology and Psychiatry (December, 1931) we 
published a formula for the correlation of general factors. The for- 
mula was derived from the multiplication of two regression equations, 
written in standard measures, and the division of this product by the 
product of the two standard deviations of the estimated values. 


Thus

$$r_{\bar{x}\bar{y}} = \frac{\sum (\bar{x} - M_{\bar{x}})(\bar{y} - M_{\bar{y}})}{N\sigma_{\bar{x}}\sigma_{\bar{y}}}$$


TABLE I.—SHOWING AN ARBITRARY GROUPING OF a VALUES AND HOW THE TETRAD
FUNCTION HOLDS OWING TO THE OPERATION OF A FORMAL CAUSE





a values grouped by a Values not affected by the 
formal cause formal cause of the a values Tag 
= SO OO ..._T eee. 
Zi = 8a + + ve + vg + og + vs tit, FE
Ze = Ta + wi + we + ss 5 
t%3 = 5a + 1 + W2 + Ts + 4 T0292, = Vi0v9 5270 
$$\bar{x} = \beta_1\frac{x_1}{\sigma_{x_1}} + \beta_2\frac{x_2}{\sigma_{x_2}} + \cdots + \beta_n\frac{x_n}{\sigma_{x_n}}$$

$$\bar{y} = \beta_1\frac{y_1}{\sigma_{y_1}} + \beta_2\frac{y_2}{\sigma_{y_2}} + \cdots + \beta_n\frac{y_n}{\sigma_{y_n}}$$

Here $x_1 \ldots x_n$ and $y_1 \ldots y_n$ are deviations from the mean.
Therefore $M_{\bar{x}}$ equals zero, for every sum $x_1, x_2 \ldots x_n$ is always zero.


The standard deviation of the estimated value of the multiple
regression equation written as above in standard measures is the
multiple correlation.¹

Multiplying, therefore, the above two equations for each individual
of the population, summing for the entire population and dividing
by N, we have


$$r_{\bar{x}\bar{y}} = \frac{\sum \beta_p\beta_q r_{pq}}{r_{xx_0}r_{yy_0}}$$


th 
In this equation $r_{xx_0}$ and $r_{yy_0}$ designate the respective multiple correla-
tions.

We shall hereafter for the sake of simplicity designate by x and y,
$\bar{x}/\sigma_{\bar{x}}$ and $\bar{y}/\sigma_{\bar{y}}$, and the true values $\bar{x}_0/\sigma_{\bar{x}_0}$, $\bar{y}_0/\sigma_{\bar{y}_0}$ we shall designate by $x_0$ and $y_0$.

We desire to approximate the true $r_{x_0y_0}$ from the correlation of the
estimated values.

There are four types of error to which the correlation $r_{xy}$ is subject:

(a) Irrelevant factors such as bias, defect of attention, age, etc.
These must be eliminated by the original set-up of conditions or by the
technique of partial correlation.

(b) Sampling errors. These are to some extent eliminated by
the summation $\sum \beta_p\beta_q r_{pq}$, but this elimination only approximates
completeness.

(c) Accidental errors of observation.

(d) Errors of estimate due to the fact that the multiple correlations
involved fall short of unity.

We must leave the errors of sampling with the approximate elimina-
tion due to the fact that the correlations $r_{pq}$ are sometimes above and
sometimes below the true values. It is very unlikely that all will be
above or below the true values.
We may derive in a simple manner our regression equation.


$$x = kx_0$$


1 See Studies in Psychology and Psychiatry, Vol. III, No. 1, 1931, p. 22.

Multiplying this equation by $x_0$ we have

$$xx_0 = kx_0^2$$


If now we sum for the whole population and divide by N, we have
$K = r_{xx_0}$ = the multiple correlation, where K is the average value of
all the individual k values. (Let it be remembered that for the sake
of simplicity we have taken $x = \bar{x}/\sigma_{\bar{x}}$ and $x_0 = \bar{x}_0/\sigma_{\bar{x}_0}$. Therefore
$\sum x_0^2 / N = 1$.)


We may now write one regression equation


$$x = x_0 r_{xx_0}$$

and similarly

$$y = y_0 r_{yy_0}$$


If we introduce $e_1$ and $e_2$ for the experimental errors and t and v for
the errors of estimate, we have the equations


$$x = x_0 r_{xx_0} + e_1 + t$$
$$y = y_0 r_{yy_0} + e_2 + v$$


If we multiply out these equations and sum, etc., for the entire
population, we have


$$r_{xy} = r_{x_0y_0}r_{xx_0}r_{yy_0} + r_{ty_0}r_{yy_0} + r_{vx_0}r_{xx_0} + r_{tv} + r_{e_1y_0}r_{yy_0} + r_{e_2x_0}r_{xx_0} + r_{e_1v} + r_{e_2t} + r_{e_1e_2}$$


The experimental errors are uncorrelated with themselves, with $x_0$
and $y_0$, and with the errors of estimate t and v, assuming no bias and
good experimental technique. 





If now we neglect also the errors of estimate, $r_{x_0y_0} = \frac{r_{xy}}{r_{xx_0}r_{yy_0}}$.


This value is evidently too high, for it would be the true value only 
if the multiple correlations were unity. Other considerations also 
demonstrate this. 

The question now arises: what error is involved in the neglect of the
three terms $r_{ty_0}r_{yy_0} + r_{vx_0}r_{xx_0} + r_{tv}$, and how may we correct for it?


$ty_0r_{yy_0} = ty - tv$ from $t(y = y_0r_{yy_0} + v)$
$vx_0r_{xx_0} = vx - tv$ from $v(x = x_0r_{xx_0} + t)$
$ty = xy - yx_0r_{xx_0}$ from $y(x = x_0r_{xx_0} + t)$
$vx = xy - xy_0r_{yy_0}$ from $x(y = y_0r_{yy_0} + v)$
We therefore have

$$xy = x_0y_0r_{xx_0}r_{yy_0} + (xy - yx_0r_{xx_0}) + (xy - xy_0r_{yy_0}) - tv$$
$$-tv = -xy + yx_0r_{xx_0} + xy_0r_{yy_0} - x_0y_0r_{xx_0}r_{yy_0}$$

Substituting, solving for $x_0y_0$, summing for the whole population
and dividing by N, we have
Trove need Tss.7 vy. oom rea) 
That is to say, our approximation neglecting the errors t and v is
true minus the error involved in the neglect. In other words, we have
derived no new relationship from our substitutions, but we are in a
position to approximate the correction — Try, a 
ZZo' YVe 
The situation may be objectified by the following simple diagram.

[Diagram: a line of length D running from $\frac{r_{xy}}{r_{xx_0}r_{yy_0}}$ to $r_{xy}$, with $r_{x_0y_0}$ marked between them; the part adjoining $\frac{r_{xy}}{r_{xx_0}r_{yy_0}}$ is uD and the part adjoining $r_{xy}$ is vD.]

Theoretically the upper limit of $\frac{r_{xy}}{r_{xx_0}r_{yy_0}}$ is infinity as $r_{xx_0}r_{yy_0}$ approaches
zero. But under the conditions of the problem $r_{xy}$ approaches zero
at the same time, so that the expression becomes indeterminate, as it
should, when the multiple correlations approach zero. Practically
cases often occur and some will be given where $\frac{r_{xy}}{r_{xx_0}r_{yy_0}}$ is well above
unity.

The upper limit of $r_{x_0y_0}$ is unity and (confining ourselves to positive
correlations) its lower limit is zero.

Let $\frac{r_{xy}}{r_{xx_0}r_{yy_0}} - r_{xy} = D$

$\frac{r_{xy}}{r_{xx_0}r_{yy_0}} - r_{x_0y_0} = uD$

$r_{x_0y_0} - r_{xy} = vD$ and $u + v = 1$
If $r_{xx_0}r_{yy_0} = 1$, then $\frac{r_{xy}}{r_{xx_0}r_{yy_0}} = r_{xy}$, and u = 0.


So that the relative size of u and v varies with the multiple correla-
tion which can vary between zero and unity.

Let us assume that $u = r_{xx_0}r_{yy_0}$.

If this is true then by solving the equation for u we have


$$r_{x_0y_0} = \frac{r_{xy}}{r_{xx_0}r_{yy_0}} - r_{xx_0}r_{yy_0}\left(\frac{r_{xy}}{r_{xx_0}r_{yy_0}} - r_{xy}\right)$$








It will be noticed that this expression must approximate the above
theoretical value. In the parenthesis $r_{xy}$ takes the place of $r_{x_0y_0}$, thus
increasing the value of the expression in parentheses, but this value is
again reduced by multiplying by $r_{xx_0}r_{yy_0}$. If $r_{xy}$ is negative the signs
automatically take care of the correction.¹
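The correction may be put in a form convenient for computation. The following is a minimal sketch, not a published routine: the function names are introduced here, and the figures used in the example are those of the first line of Table II (multiple correlations .8293 and .8274, observed correlation of the estimates .859759).

```python
def uncorrected(r_xy, r_xx0, r_yy0):
    """Upper limit: r_xy divided by the product of the multiple correlations."""
    return r_xy / (r_xx0 * r_yy0)


def corrected(r_xy, r_xx0, r_yy0):
    """Approximation to the true correlation on the assumption that u equals
    the product of the two multiple correlations."""
    product = r_xx0 * r_yy0
    upper = r_xy / product
    return upper - product * (upper - r_xy)


upper = uncorrected(0.859759, 0.8293, 0.8274)
estimate = corrected(0.859759, 0.8293, 0.8274)

print(round(upper, 6), round(estimate, 6))
```

The estimate necessarily falls between the observed correlation and the uncorrected quotient, since the amount subtracted is a fraction (the product of the multiple correlations) of the whole distance D between them.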

To test our formula in practice we took the table of intercorrelations 
of character traits in Studies in Psychology and Psychiatry, Vol. II, 
No. 4, pp. 164-165; we selected for example alternate tests of “will”
and correlated the two estimated values of will, etc. Since they are 
estimates of the same thing the correlation should be unity; since the 
sampling error for one measure is not the same as that for another 
it is to be expected that the values will fluctuate above and below unity. 




















TABLE II

r_xx0      r_yy0      r_xx0·r_yy0   r_xy       r_xy/(r_xx0·r_yy0)   r_x0y0 calculated   Number of variables

True value = 1
.8293      .8274      .686183       .859759    1.252995             .983171             2
.7589      .7273      .551948       .774336    1.402915             1.055972
.8408      .3359      .282425       .351965    1.246214             .993663             2
.8409      .8393      .705767       .814266    1.153732             .914148             2
.877257    .877588    .769870       .941628    1.223100             1.006403            3
.879807    .877565    .772088       .949814    1.230189             1.014078            4
                                                          Average = .994572

Observed value = .576
.617180    .781792    .482506       .357571    .741071              .556060             4

























1 Cf. hereon also Studies in Psychology and Psychiatry, Vol. III, No. 3, pp.
112-114.

A similar test was made for the correction of two anthropological
measures.

The results are summarized in the accompanying table (II), 
which shows that the formula gives a fair measure of correction even 
with a few variables. 


The true correlation lies above $r_{xy}$ and below $\frac{r_{xy}}{r_{xx_0}r_{yy_0}}$ and the
formula for correction locates the true value between these limits with
an approximation that increases with the multiple correlations involved.

(ii) Application of the formula in the study of the relationship of
general factors. 

The accompanying table of correlations (Table III) gives us the 
intercorrelations of the ten tests comprising the Otis Group Intelli-
gence Scale. They have been divided into two groups, the odd num-
bered tests and the even numbered tests.¹





TABLE III.—SHOWING THE DISTRIBUTION OF THE ODD AND EVEN NUMBERED
TESTS OF THE OTIS GROUP INTELLIGENCE SCALE INTO TWO GROUPS WITH
A GENERAL FACTOR IN EACH, TOGETHER WITH THE WEIGHTS OF EACH
TEST AND ITS CORRELATION WITH THE GENERAL FACTORS

           Weights  Odd g  Even g    1      3      5      7      9      2      4      6      8      10
Weights     ....    ....    ....   .314   .128   .144   .384   .228   .222   .357   .337   .162   .126
Odd g       ....    ....    .743   .665   .419   .517   .723   .583   .488   .607   .590   .393   .405
Even g      ....    .743    ....   .515   .333   .513   .654   .495   .601   .691   .673   .482   .422

1           .314    .665    .515   ....   .280   .407   .456   .354   .274   .446   .485   .222   .182
3           .128    .419    .333   .280   ....   .172   .263   .337   .284   .355   .133   .129   .246
5           .144    .517    .513   .407   .172   ....   .439   .238   .314   .364   .467   .305   .271
7           .384    .723    .654   .456   .263   .439   ....   .432   .462   .512   .505   .396   .339
9           .228    .583    .495   .354   .337   .238   .432   ....   .336   .396   .348   .254   .406

2           .222    .488    .601   .274   .284   .314   .462   .336   ....   .476   .404   .277   .219
4           .357    .607    .691   .446   .355   .364   .512   .396   .476   ....   .415   .276   .346
6           .337    .590    .673   .485   .133   .467   .505   .348   .404   .415   ....   .397   .266
8           .162    .393    .482   .222   .129   .305   .396   .254   .277   .276   .397   ....   .209
10          .126    .405    .422   .182   .246   .271   .339   .406   .219   .346   .266   .209   ....












































There is a general factor in the whole table, there is a general factor 
in the odd numbered tests and there is also a general factor in the even 
numbered tests. Is the general factor in the odd numbered tests the 





1 The correlations have been taken from the work of Cairns, Rev. George J.
“An Analytical Study of Mathematical Abilities.” Catholic University of America
Educational Research Monographs, Vol. VI, 1931, p. 3.
TABLE IV.—TETRAD DIFFERENCES

ODD GROUP          EVEN GROUP
  .015879            .077468
  .044488            .074017
  .028609          — .003451
— .070519          — .013168
  .005752            .035731
  .076271            .048899
— .032712            .003642
  .027858            .039040
  .060570            .035398
  .067296            .010754
  .020418          — .002507
— .046878          — .013261
  .011710            .013319
— .073639          — .050627
— .085349          — .063946

PE = .037336       PE = .036263


same as that in the even numbered tests? If it is, then the correlation 
between the two should be unity. 
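
The odd-group column of Table IV can be checked against the intercorrelations of Table III. What follows is a minimal Python sketch rather than the author's own procedure: the dictionary `R`, the helper `r`, and the function `tetrad_differences` are names introduced here for illustration, and the order of the three differences within each set of four tests is an assumption, chosen because it reproduces the printed values.

```python
from itertools import combinations

# Intercorrelations of the odd numbered tests (1, 3, 5, 7, 9), read from Table III.
R = {
    (1, 3): .280, (1, 5): .407, (1, 7): .456, (1, 9): .354,
    (3, 5): .172, (3, 7): .263, (3, 9): .337,
    (5, 7): .439, (5, 9): .238,
    (7, 9): .432,
}

def r(a, b):
    """Look up a correlation regardless of the order of the two tests."""
    return R[(a, b)] if (a, b) in R else R[(b, a)]

def tetrad_differences(tests):
    """Return the three tetrad differences for every set of four tests.

    For tests a, b, c, d the three products are r_ab*r_cd, r_ac*r_bd and
    r_ad*r_bc; the differences are taken as (first - second),
    (first - third) and (second - third). This ordering is an assumption.
    """
    rows = []
    for a, b, c, d in combinations(tests, 4):
        p1 = r(a, b) * r(c, d)
        p2 = r(a, c) * r(b, d)
        p3 = r(a, d) * r(b, c)
        rows.append((p1 - p2, p1 - p3, p2 - p3))
    return rows

odd_tetrads = tetrad_differences([1, 3, 5, 7, 9])
for row in odd_tetrads:
    print("  ".join(f"{t: .6f}" for t in row))
```

Under these assumptions each printed row corresponds to one block of three values in the odd-group column of Table IV. The even-group column could be checked in the same way by supplying the intercorrelations of tests 2, 4, 6, 8, and 10.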
What now is the correlation in question? The data are: 


$$r_{xy} = .743158$$
$$\sqrt{r_{xx}\,r_{yy}} = .742792$$

From the formula

$$r_{g_x g_y} = \frac{r_{xy}}{\sqrt{r_{xx}\,r_{yy}}}$$

we obtain $r_{g_x g_y} = 1.000493$. This would be very good evidence of
unit correlation were it not that we have shown that this formula 
over-corrects and the true correlation lies between this value and 
.743158. Applying our formula for correction (page 412), we find that 
the correlation between the two g-factors must be in the neighborhood 
of .809347. 

Let us now see what happens when we drop one of the tests from 
each group. 


Let us first drop the test that has the smallest saturation with g. 
The data now are: 


$$r_{xy} = .726763$$
$$\sqrt{r_{xx}\,r_{yy}} = .736563$$


By our formula the true value of the correlation between the two 
general factors must lie between .726763 and .986695 and in the 
neighborhood of .795239. 










418 The Journal of Educational Psychology 


Let us now drop the test which has the highest saturation with g 
and our data become 


$$r_{xy} = .604804$$
$$\sqrt{r_{xx}\,r_{yy}} = .667899$$


By our formula the true value of the correlation between the two 
general factors must lie between .905532 and .604804 and in the 
neighborhood of .704676. 

The apparent meaning of these results is that the general factor 
in the odd numbered tests is not the same as that in the even numbered 
tests and that it must be compounded in some way. 

Similarly one can form, although with some difficulty in selection, 
two groups from the table given in Spearman’s Abilities of Man 
(p. 145) which satisfy the tetrad criterion and which give no values 
above unity when one applies the formula for the correlation of the 
specifics with the general factor. The data involved are as follows: 





$$r_{xy} = .802840$$
$$\sqrt{r_{xx}\,r_{yy}} = .812119$$

$$\frac{r_{xy}}{\sqrt{r_{xx}\,r_{yy}}} = .988574$$


We know that the correlation of the two general factors cannot be 
above .988574, that it lies between .802840 and .988574 and by the 
above formula for correction it should lie in the region of .837736. 

In like manner we can obtain the correlation between the g-factor 
in Stephenson’s verbal subtests and that in his non-verbal subtests.¹
These correlations are particularly valuable inasmuch as they are 
based on 1037 cases and are probably fairly free from sampling error. 
We used the data he gave for the non-verbal tests I, III, V, VIII and 
the verbal tests 2, 4, 6, 8. 

In this way we obtain the following data: 


$$r_{xy} = .620421$$
$$\sqrt{r_{xx}\,r_{yy}} = .774431$$


By our formula the true correlation should lie between .801131 and 
.620421 and in the neighborhood of .661184. 

It would seem that in this case also there must be some difference 
between the g-factor in the verbal and that in the non-verbal tests. 
If this be the case the g-factors are not simple but compound. 
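
The bounds quoted in the five cases above can be recomputed from the printed data. The sketch below is illustrative only: the case labels and the function name `upper_bound` are introduced here, the first denominator is read as .742792 (the value that reproduces the reported 1.000493), and the author's own correction formula from page 412, which yields the "neighborhood" values, is not reproduced because it is not given in this section.

```python
# (observed correlation r_xy, denominator sqrt(r_xx * r_yy)) as reported in the text.
cases = {
    "Otis, odd vs. even tests": (0.743158, 0.742792),
    "Otis, lowest-saturation test dropped": (0.726763, 0.736563),
    "Otis, highest-saturation test dropped": (0.604804, 0.667899),
    "Spearman, Abilities of Man, p. 145": (0.802840, 0.812119),
    "Stephenson, verbal vs. non-verbal": (0.620421, 0.774431),
}

def upper_bound(r_xy, denominator):
    """Observed correlation divided by sqrt(r_xx * r_yy).

    The text treats this quotient as an over-correction, so it serves
    as the upper limit; the observed r_xy is the lower limit.
    """
    return r_xy / denominator

bounds = {name: upper_bound(r_xy, d) for name, (r_xy, d) in cases.items()}

for name, (r_xy, _) in cases.items():
    print(f"{name}: between {r_xy:.6f} and {bounds[name]:.6f}")
```

Only the first case gives a quotient above unity; in the other four the upper limit itself already falls short of 1.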


1 Journal of Educational Psychology, Vol. XXII, No. 5, 1931, pp. 334-350. 
From this study of the correlation of general factors it seems that
the general factor in a group of tests is not precisely the same as that 
in a closely related group of tests. Though there is a general factor 
in a whole table of tests, when we split it arbitrarily into two tables of 
tests the general factor in one of these tables is somewhat different 
from that in another. 

How is all this to be interpreted? A fairly simple interpretation 
would be as follows. Whenever a group of variables shows significant 
intercorrelations these variables are all connected by a common matrix 
of causal relations. This common matrix of causal relations is the 
general factor. The causal factors in each variable not found in the
common matrix constitute its specific factor.

When the tetrad function holds there are no significant causal 
relations between two variables other than those contained in the 
common matrix. 

Whenever we add one or more variables to a group or subtract one 
or more from it, we are very likely to alter significantly the common 
matrix.¹ The g-factor in one group of variables is therefore seldom
if ever precisely the same as that in another group of variables. 

This does not mean that the composition of g can be anything we 
please. We are limited by the measures available and the structure 
of that which is measured, namely, the human mind. The inter- 
correlation of general factors shows us that the causal background 
which lies back of the structure of the human mind is not manifested 
in the same way by all types of performances. This causal back- 
ground may be compared to a fabric which is not unfigured but figured. 
The figure in one region is not the same as that in another and as far 
as we can learn by tests the figures shade into each other rather than 
come suddenly to an abrupt end. 

The nature and meaning of the patterns is not revealed to us by 
mere statistical procedure, but only the magnitude of the differences 
between them. A descriptive definition, however, can be formulated 
by a study of our measures, even as a chemist describes an element by 
giving its specific gravity, melting point, etc. The task in statistical 
analysis must therefore be to find as many groups of tests as possible 
in each one of which there is a common factor, but the tests must be
chosen so that the common factors themselves show minimum
intercorrelations.


1 This view has been expressed by Wilson, Edwin B.: “Comment on Professor
Spearman’s Note.” Journal of Educational Psychology, Vol. XX, 1929, pp.
217-223.


SUMMARY 


From a study of Hagen’s demonstration of the law of error it was 
pointed out that efficient and instrumental causality explain individual 
effects and that the law of chance explains the form of distribution in 
accordance with the law of error; but the original fact that the effects 
are distributed about a definite mean is not explained by efficient and 
instrumental causality, nor by the law of chance, nor by all these 
factors combined, and can only be explained by a formal cause which 
directs the activity of efficient causes to a definite end. 

It was pointed out that in biology and psychology measurements 
are very often distributed about a mean in accordance with the law of 
error. Formal causes must, therefore, be active in the organic and 
mental world. 

It was shown that a monistic hypothesis which would attempt to 
explain all things measurable by a number of identical causes, so that 
one object would differ from another only quantitatively but not 
qualitatively, and no object would be configured by any formal cause, 
would demand perfect correlation between all functions and organs 
measurable and for this reason alone would be untenable. 

The attempt was then made to conceive of the organic world as 
deriving from a vast number of different primordial uncorrelated causes. 
Thus some twenty generations ago there were something like a million 
ancestors for the individual of today. Each one of these ancestors 
contributes a factor towards making the individual of today what he is. 

On this hypothesis it was shown that all correlations may be 
explained by the percentage of overlap in the causes involved. 

Coming to the field of mental tests and general factors it was 
pointed out that when test values are distributed according to the 
law of error they must be distributed owing to the action of a formal 
cause. Various tests may be measures of this formal cause. It is 
not likely that any one test will embrace all the ultimate infinitesimal 
efficient causes that produce the thing measured which is configured 
by the formal cause. It is not likely that any one test will embrace 
exactly the same ultimate causes as any other test. Unless one test 
embraces in its underlying general factor exactly the same causes 
as another test, the correlation between the general factor in one test 
and that in the other will fall short of unity. 
The attempt was then made to correlate general factors which
should be identical. Thus in a table of intercorrelations with ten 
variables (the Otis Group Intelligence Scale) the general factor for the 
odd numbered tests was correlated with that for the even numbered 
tests. The correlation between the two general factors fell short of 
unity. The test that contributed least in each group was then omitted, 
the general factors in the paired groups of four tests each were corre- 
lated and the correlation fell still further short of unity. When the 
test that contributed most to each group was omitted the lack of 
identity between the two general factors became still greater. 

Various other situations were tried and in all cases the correlation
between the general factors was less than unity.

The empirical results therefore confirm the theoretical expectations. 
Theoretically the representatives of a general factor in one test should 
differ from the representatives of that factor in another test. Mathe- 
matically the general factor in one group of tests was found in all the 
cases investigated to be different from that in its fellow group, although 
conditions were chosen so as to be most favorable for identity. 

Although one general factor as measured differs from another general
factor, logically and really each may be an expression of the same
thing, the same formal cause by which the objects measured are 
configured. 

The identity of general factors is not to be determined solely by 
statistical results. The tests used must be studied and analyzed. 
Tests for eyesight do not measure hearing. Therefore, in constructing 
tests and studying the results of testing the fundamental problem 
must always be a careful study of the things measured. 

In dealing with general factors, as measured by different groups of 
tests, we must ask: 

1. Is there a formal identity between the general factor in one group 
and that in the other? This may be assumed to be the case when on 
combining the intercorrelations of the two groups the tetrads are all 
zero within the limits of sampling error. 

2. To what extent do the ultimate causes, involved in the general 
factor represented by one group of tests, overlap with those represented 
by another group of tests? This is determined by correlating the two 
general factors. 

It has been empirically shown that, though two groups of tests 
may be configured by one and the same formal cause or general factor, 
the ultimate configured causes do not entirely overlap. 
THE PERSISTENCE OF ERRORS IN SUCCESSIVE
TRUE-FALSE TESTS 


ROBERT T. ROSS AND MARJORIE PIRIE 
Pomona College 


The statement is often made with regard to True-False tests, that 
students tend to persist in misconceptions which arise from giving 
wrong answers when taking the tests. Thus Cocks¹ states, “We are
all familiar with the old maxim: Never present false or incorrect
statements to the class. Most educators probably would agree that the
repeated presentation of false statements to a child would result
disastrously in erroneous learning.” Similarly Remmers and Remmers²
preface their study with the remark that, “One of the objections
frequently raised against true-false statements is the old pedagogical
maxim that no false association should ever be presented to the
learner.”

Although both of these studies tend to refute the reported asser- 
tions, their conclusions are reached by studying the teaching values 
of the test, or by comparing it with other types of test. It has seemed 
advisable to the writers to determine the correctness of the supposition 
by direct measures of the retention of erroneous ideas arising from the 
true-false test itself, and to discover, if possible, the relative values of 
various methods of administration.

To this end, three tests were constructed of twenty items each. 
The questions covered material not generally known and not easily 
attainable during the course of ordinary academic life. The following 
selection of typical questions is taken at random from the three tests. 


....A Berm is a form of protozoa.
....A Solenoid is a sun-spot.
....Diastrophism is a phenomenon which causes harbors.
....Momus was the god of mockery among the ancients.
....“Low-ball” is a term in cricket.
....The mimber was an ancient stringed instrument.
....The nearest star is Centauri.
....Cervantes wrote “The Cid.”
....A Satrap was a viceroy of ancient Persia.
....The Gutenberg Bible was printed in German.


It was proposed to give each test to the same group for three suc- 
cessive times with a procedure unique for each test, but not for each 
administration. 
PROCEDURE


Test A was given on the Mondays of three successive weeks to a 
class of seventy college sophomores. When the class had finished the 
test, the papers were collected and academic work began. The 
procedure remained the same for all three administrations. 

Test B was given on the Wednesdays of three successive weeks. 
At the conclusion of the testing the papers were collected and the 
instructor then read each item and pronounced it “true” or “false”
without further comment. 

Test C was given on the Fridays of the aforementioned three weeks. 
At the conclusion of the testing, the papers were called in and the 
instructor discussed each question in some detail, answering questions 
and stimulating whatever discussion was forthcoming. For example, 
“The Gutenberg Bible was not printed in German but in Latin. The 
Book was printed at the first printing press: that of Gutenberg, which 
was first operated in Germany about 1453. The Bible was probably 
printed about 1457.” This same discussion was repeated after each 
administration. 

From each of the three sets of each test the papers belonging to 
the same individual were removed and the number of errors which 
persisted from the first testing (I) to the second (II), and from the 
second (II) to the third (III), were recorded. The errors which 
persisted through all three sets (I-III) were also noted. A record was 
also kept of the number of new errors which appeared at each testing. 

From these data it was possible to determine whether identical 
errors tended to persist, and under which of the procedures outlined 
above the persistence was most common. 
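
The tabulation described above amounts to comparing, for each student, the sets of items missed on the three administrations. A minimal sketch in Python follows; the function name `persistence`, the dictionary keys, and the example data are hypothetical, and counting a "new" error in III relative to II only (rather than relative to both earlier papers) is an assumption about how the authors tallied.

```python
def persistence(wrong_i, wrong_ii, wrong_iii):
    """Count persistent and new errors for one student's three papers.

    Each argument is the set of item numbers answered wrongly on that
    administration of the same twenty-item test.
    """
    return {
        "I and II": len(wrong_i & wrong_ii),        # missed on both I and II
        "II and III": len(wrong_ii & wrong_iii),    # missed on both II and III
        "I and III": len(wrong_i & wrong_iii),      # missed on both I and III
        "I to III": len(wrong_i & wrong_ii & wrong_iii),  # missed all three times
        "new in II": len(wrong_ii - wrong_i),       # missed on II but not on I
        "new in III": len(wrong_iii - wrong_ii),    # missed on III but not on II (assumed)
    }

# Hypothetical student: item numbers missed on each administration.
example = persistence({2, 5, 9, 14}, {5, 9, 17}, {9, 17, 20})
print(example)
```

Averaging each of these counts over all students who took the three administrations of a test would give figures of the kind reported in the tables of the Results section.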


RESULTS 


It was found that the number of papers for which all three adminis- 
trations were at hand varied somewhat. For Test A, N = 60; for 
Test B, N = 56; for Test C, N = 48. 

Table I shows the mean scores (number of items right) and 
the probable errors of the distributions for each test and for each 
administration. Table I indicates that the tests were of approximately 
equal difficulty, and shows at once something of the merits of the three 
methods. 

Table II shows the average number of errors which persisted from 
one administration to the next, and through all three, for each test. 
TABLE I.—MEAN SCORES (NUMBER RIGHT IN TWENTY ITEMS) AND THE PROBABLE
ERRORS OF THE DISTRIBUTIONS

                  A (N = 60)       B (N = 56)       C (N = 48)
                  (no answer)    (answers only)    (discussion)
Administration     M      PE       M      PE        M      PE
I                 9.38   0.35    11.00   0.31      9.75   0.29
II                9.95   0.34    16.57   0.25     17.10   0.24
III              10.62   0.29    18.02   0.19     18.79   0.18























TABLE II.—MEAN PERSISTENT ERROR AND PROBABLE ERROR OF THE
DISTRIBUTIONS

                   A (N = 60)      B (N = 56)      C (N = 48)
Administrations     M      PE       M      PE       M      PE
I and II           7.36   1.43     2.16   0.90     2.04   0.92
II and III         7.50   1.40     1.21   0.67     0.90   0.70
I and III          7.20   1.55     1.19   0.73     0.70   0.61
I to III           5.82   1.47     0.73   0.62     0.54   0.67














To be read: in Test A, there were 7.36 errors common to administrations I and 
II with a probable error of 1.43; and 5.82 errors which occurred in I, II, and III, 
i.e. I to III. 

Table III shows the differences in the average number of errors 
which persisted in each type of test for each administration, and the 
critical ratio of reliability as given by Garrett, $D/PE_{\mathrm{diff}}$, where a ratio
of four indicates satisfactory reliability. 


TABLE III.—DIFFERENCES BETWEEN MEAN PERSISTENT ERRORS FOR EACH TEST
AND FOR EACH ADMINISTRATION AND THE CRITICAL RATIOS OF RELIABILITY

Administrations   A - B   D/PE_diff   B - C   D/PE_diff   A - C   D/PE_diff
I and II           5.20     19.25      0.12      0.50      5.32     23.11
II and III         6.29     31.12      0.32      0.37      6.60     31.90
I and III          6.01     27.15      0.48      3.70      6.49     29.61
I to III           5.09     24.31      0.19      0.19      5.28     25.12





To be read: Of the errors persisting from administration I to administration II, 
5.20 more errors persisted under method A than under method B. This difference 
is approximately five times as reliable as necessary (19.25/4). 
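
The differences in Table III follow from the means of Table II by subtraction, and the reading of the critical ratios against the criterion of four can be made mechanical. The sketch below is illustrative: the variable names are introduced here, the critical ratios are copied from Table III as printed rather than recomputed, since the probable errors of the differences are not given.

```python
# Mean persistent errors from Table II: (Test A, Test B, Test C).
table_ii = {
    "I and II": (7.36, 2.16, 2.04),
    "II and III": (7.50, 1.21, 0.90),
    "I and III": (7.20, 1.19, 0.70),
    "I to III": (5.82, 0.73, 0.54),
}

# Critical ratios D/PE_diff as printed in Table III: (A - B, B - C, A - C).
table_iii_ratios = {
    "I and II": (19.25, 0.50, 23.11),
    "II and III": (31.12, 0.37, 31.90),
    "I and III": (27.15, 3.70, 29.61),
    "I to III": (24.31, 0.19, 25.12),
}

THRESHOLD = 4.0  # ratio taken in the text to indicate satisfactory reliability

def differences(a, b, c):
    """Differences between mean persistent errors: A - B, B - C, A - C."""
    return (round(a - b, 2), round(b - c, 2), round(a - c, 2))

diffs = {row: differences(*means) for row, means in table_ii.items()}
reliable = {
    row: tuple(ratio >= THRESHOLD for ratio in ratios)
    for row, ratios in table_iii_ratios.items()
}

for row in table_ii:
    print(row, diffs[row], reliable[row])
```

Differences recomputed from the rounded means may differ from the printed ones by a unit in the last place. On the printed ratios, every A - B and A - C comparison meets the criterion and no B - C comparison does.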


Table IV shows the occurrence of new errors; that is, errors occur- 
ring in the second administration which were not present in the first 
administration, etc. 
TABLE IV.—MEAN NEW ERRORS AND PROBABLE ERRORS OF THEIR DISTRIBUTIONS

                  A (N = 60)      PE       B (N = 56)      PE       C (N = 48)      PE
I and II             2.75     [illegible]  [illegible]  [illegible]  [illegible]  [illegible]
II and III           2.07        0.92         1.16         0.54         0.39         0.44
[illegible]       [illegible]  [illegible]  [illegible]    0.59         0.48         0.47





To be read: With Test A, there were 2.75 new errors in administration II which 
did not occur in administration I. 


With all tables it must be kept in mind that the number of items 
in each test was twenty. To express the results on the basis of one 
hundred it is, of course, necessary to multiply by five. 

Tables II and III clearly show that more errors tend to persist 
with Test A than with either Test B or Test C, and that the differences 
indicated by the tables are significant as far as the differences between 
Test A and Test B, and Test A and Test C are concerned; but are not 
significant for the differences between Test B and Test C. This would 
indicate that a distinct improvement in results is to be had by giving 
the answers to the tests immediately after administration, but that 
there is little to be gained by discussing the test items. 

Table IV shows, on the other hand, that there is a slight but signifi- 
cant advantage in discussion inasmuch as it tends to prevent the occur- 
rences of new errors in the subsequent tests. 

It seems reasonable to maintain, from the data presented, that 
although the True-False test is open to certain limitations as far as 
its efficiency as a teaching device is concerned, the current belief in its 
detrimental value in inculcating false information as a result of student 
mistakes depends entirely on evidence gained from administering the 
test without giving the answers; and that when the answers are given, 
with or without discussion, errors do not tend to persist. 


SUMMARY 


Three different True-False tests of twenty items each were given 
three times at intervals of one week to a class of seventy college stu- 
dents. The first test was always given without comment. The second 
test was followed by a list of answers. The questions of the third test 
were discussed after each administration. 

The number of errors which persisted from one administration to 
another, and the number of new errors at each administration were 
tabulated and the methods compared on these grounds. 
It was found that more errors tended to persist when no answers
were given than when the answers were read or discussion allowed, but 
that there was no significant advantage of discussion over answers, 
or vice versa. 

New errors tended to appear more frequently when answers were 
not given than when they were, but the number of new errors remained 
practically the same throughout the various administrations of any 
one test. There appears, however, a slight but significant value in 
discussion inasmuch as it tends to forestall the occurrence of novel 


mistakes. 
THE INFLUENCE ON LEARNING AND RETENTION
OF WEEKLY AS OPPOSED TO MONTHLY TESTS 


NOEL KEYS 
University of California 


The literature on learning is replete with studies demonstrating that 
knowledge by the learner of his own progress is of the utmost impor- 
tance in the motivation of improvement. Quite as numerous are 
experiments which reveal the superiority of distributed over concen- 
trated learning. An obvious means of insuring the operation of these 
two factors in a typical school situation would be the administration of 
examinations with more than ordinary frequency. It might be sup- 
posed that educational literature would include many investigations 
under this head. Virtually all reported studies, however, involve the 
use of tests and test results for direct instruction, thus introducing 
additional variables into the situation. Moreover, the writer is aware 
of no experiment in which the influence of frequent as compared with 
infrequent testing has been studied, uncomplicated by differences in 
the amount of test materials employed. The following represent the 
nearest approximations. 

The pioneer experiment is that of Jones,² who discovered that
classes tested immediately after each lecture profited so greatly thereby 
that, after eight weeks, almost twice as much of the content was 
retained as in the case of material not thus examined. 

In 1931, Turney* found that forty students of educational psychol- 
ogy who had started the semester with a mean pre-test score of 85.2, 
versus 108.1 for a control section of equal intelligence, had completely 
overcome this handicap so as to score equal to the controls on the 
final examination. The only reported difference in the treatment 
of the two sections was the giving of thirteen intermediate tests to 
the experimental, as contrasted with two to the control group. The 
difference in pre-test scores amounted to 3.6 SD, and that in the 
gains to 2.8, but the experiment suffers from the fact that the two 
sections differed so widely in initial knowledge of the subject. 

Kulp* gave a class of graduate students in educational sociology a 
ten-minute objective test each week for the first half of the course. 
Those who showed above average standing at mid-semester were 
excused from such tests for the remainder of term, while the rest con- 
tinued as before. The superiority of this upper half over the lower, 
which had amounted to thirty-nine per cent at mid-term, fell on the
final examination to five per cent, or 0.9 SD. His numbers, however,
were small, being but fourteen and eighteen for the two groups, and 
statistical regression alone would account for a goodly part of the 
decline observed. 

Hertzberg, Heilman, and Leuenberger¹ report that groups in
educational psychology provided with extensive practice on objective 
tests as instructional aids, displayed a significant superiority over con- 
trol sections, which amounted to twelve or fifteen per cent on mid-term 
tests. When, however, a final examination was given both sections, 
unpreceded by review with practice test materials, the experimental 
groups scored no higher than the controls. 

Lastly may be mentioned the experiment of C. H. Smeltzer, as 
summarized by Pressey.‘ His subjects were two sections of a class 
in educational psychology, meeting five times weekly and numbering 
seventy-six and eighty-eight members, respectively. The experimental 
section was given a weekly test each Thursday and the papers returned 
and “gone over” in Friday class. Those whose marks had been
satisfactory were excused from the Monday session, while those who 
had done poorly took a re-test on that day. On the final examination, 
the experimental group showed a score of 230.6 as against a median of 
219.3 for the control, after correction for differences in the pre-test 
scores of the two groups. 

The study which constitutes the subject of the present article is 
distinguished from all of the foregoing in that tests administered to 
the two sections were identical, both in content and total amount, 
differing only in that the experimental group took these in brief weekly 
installments, and the control in the form of long mid-term examinations. 
The experiment differs further from the two last-named, in that no 
attempt was made to utilize the tests for purposes of direct instruction. 
Papers were not returned or “gone over” in class, nor were quiz
sections of any sort provided. Finally, the numbers involved are from 
two to ten times those in previous investigations. 


EQUATING OF GROUPS IN THE PRESENT STUDY 


The class involved was one in educational psychology at the Univer- 
sity of California in the spring semester of 1934. The two sections 
totalled three hundred sixty students, but after rejecting members 
absent from one or another of the tests and those for whom no matching 
was possible, there remained one hundred forty-three in each experimental
group. These reduced groups were most carefully matched,
person for person, as to (1) sex, seventy per cent being women, and 
(2) scores on a pre-test administered as soon as enrollment was com- 
plete. This pre-test consisted of one hundred sixty-seven true-false 
statements, representing a careful sampling of the subject-matter of 
the course. The items included had previously been validated by the 
elimination of those failing to show a higher percentage of errors in 
papers of the poorest as contrasted with the best third of a class at 
Cornell University. One hundred eighteen of the one hundred sixty- 
seven items dealt with material to be covered in the first two thirds of 
the course. These one hundred eighteen, which were subsequently 
readministered as an end-test, will be designated Test A. Table I 
shows the nearly identical performance of the two groups, both on 
Test A and on the entire pre-test. 


TABLE I.—EQUALITY OF EXPERIMENTAL AND CONTROL GROUPS IN SCORES ON THE
PRE-TEST AT OPENING OF TERM
(Scores Are on the Basis of Right Minus Wrong Answers)

                         Test A (one hundred eighteen      Entire pre-test (one
                         items covering first two-thirds   hundred sixty-seven
                         of the course)                    items)
Section            N       Mean      SD                      Mean      SD
Experimental      143      18.1     13.9                     25.7     15.9
Control           143      18.2     13.7                     25.6     16.0




















Each section met for three periods a week under the writer’s 
instruction. Class periods were devoted almost exclusively to lectures, 
save for time consumed in testing. The same lecture hall was used for 
both groups, and great pains were taken to keep the instruction iden- 
tical. The control section, however, met after the experimental, 
and probably enjoyed a certain advantage on that account. To 
offset this as far as practicable, one-fourth of the lectures and one-half 
of the total tests and examinations were scheduled to be given first to 
the controls. A further possible advantage of the control group 
lay in the fact that fourteen per cent of its members were graduate 
students, as against six per cent of the experimental. 
PLAN OF THE EXPERIMENT


The fifteen weeks of the spring semester prior to final examinations 
were divided into three equal parts. Omitting holidays and the first 
week, which was occupied with class organization, and administration 
of the pre-test and questionnaire, each part contained the equivalent 
of four weeks of three class meetings each. For the first of these 
divisions, the experimental section was furnished with a mimeographed 
sheet indicating the topics and chapters of reading assigned for each 
week, and the date of the weekly test covering each assignment. The 
control section, however, received only a lump assignment covering 
the work of the four-week period, with the date of the mid-term test to 
follow. For the second division, both sections received weekly assign- 
ments, but the experimental group continued to be tested each week 
while the controls again took but one monthly examination. For the 
third part of the course, both groups had a single monthly examination, 
but the control received weekly assignments while the experimental 
group did not. 

Examinations were based one-third on the lectures, and one-third 
on each of the two textbooks which constituted the sole assigned read- 
ing. The content of the monthly examinations given the control 
group was identical with that taken by the experimental group in 
weekly installments. All tests were strictly objective. The periodic 
tests consisted of true-false and completion items in the ratio of seven 
to one. The pre-test and final examinations were exclusively true- 
false. All true-false sections were scored on the basis of the number 
of right answers minus the number of wrongs. 
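
The right-minus-wrong rule can be written out as a short sketch. The function name, the use of `None` for an omitted item, and the example paper are assumptions made here for illustration; the text does not say how omissions were handled, and the sketch simply leaves them out of both counts.

```python
def rights_minus_wrongs(responses, key):
    """Score a true-false paper as number right minus number wrong.

    `key` maps item number to the correct answer (True or False).
    `responses` maps item number to the student's answer; a missing
    entry or None is treated as an omission and is not counted
    (an assumption, not stated in the text).
    """
    right = 0
    wrong = 0
    for item, correct in key.items():
        answer = responses.get(item)
        if answer is None:
            continue  # omitted item: neither right nor wrong
        if answer == correct:
            right += 1
        else:
            wrong += 1
    return right - wrong

# Hypothetical four-item key and paper: items 1 and 4 right, item 2 wrong, item 3 omitted.
key = {1: True, 2: False, 3: True, 4: False}
paper = {1: True, 2: True, 4: False}
score = rights_minus_wrongs(paper, key)
print(score)
```

On two-choice items a blind guess is as likely to be wrong as right, so under this rule guessing adds nothing to the expected score.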

Following each of the weekly or monthly tests a complete distribu- 
tion of scores was posted, showing the exact position of each student 
relative to the other members of his section. Papers, however, were 
not returned, nor were quiz or review sessions conducted.¹


1 In order to maintain as “natural” an atmosphere as possible and prevent the
suspicion that an experiment was in progress, the instructor did not refuse to
state the correct answer to items on the weekly or monthly tests when students
asked for these from memory in lecture sessions following an examination. Less
than ten per cent of the questions, however, were so discussed. Similarly, students
who sought permission to see their corrected papers were allowed to do so, but the
time and place were intentionally made so inconvenient that, on the average test,
but one in ten availed himself of the privilege.


On the next to the last class meeting, five weeks after the last test
over Part 2, and two weeks before the final examination, Test A was
readministered to both sections. This, as will be recalled, consisted of
one hundred eighteen items covering subject-matter of the first two- 
thirds of the course. This test was given entirely without warning, 
and so constituted an uncommonly fair measure of comparative 
retention by the two groups after a lapse of five weeks without special 
review. The final examination consisted of Test B, a form parallel to 
Test A, of approximately the same length, and dealing only with the 
first two parts of the course. Reliabilities of the separate forms were 
.84 and .70, respectively, or .89 for the composite final examination. 
Table II makes clear the differences in programs of the two sections. 


TABLE II.—DIFFERENCES IN THE PROGRAM OF ASSIGNMENTS AND TESTING IN
EXPERIMENTAL AND CONTROL SECTIONS

Division of the course   Experimental section            Control section

Introduction             Pre-test and preliminary questionnaire.

Part 1                   Specific weekly assignments     Monthly assignment and
                         and four weekly tests,          single monthly examination,
                         Forms 1a, 1b, 1c, and 1d.       Form 1.

Part 2                   Specific weekly assignments     Specific weekly assignments,
                         and four weekly tests,          with a single monthly
                         Forms 2a, 2b, 2c, and 2d.       examination, Form 2.

Part 3                   Monthly assignment, with a      Specific weekly assignments,
                         single monthly examination,     with a single monthly
                         Form 3.                         examination, Form 3.

Closing sessions         Questionnaire readministered.
                         Test A readministered without warning, as end-test.

Final examination        Test B, paralleling Test A and treating the subject-
                         matter of Parts 1 and 2 only.








RESULTS OF THE PERIODIC AND FINAL TESTS 


Scores of the two groups on the various periodic and final examina- 
tions may be seen in Table III. Figures shown for the experimental 
section on Parts 1 and 2 are in each case the sum of scores on the four 
weekly tests. The score of the control group is that made on the 
monthly examination, composed of the same items. All totals have 
been obtained by simple addition, without correction for differences in 
variability of scores. This serves to give greatest weight to
performance on Part 1, in which the experimental factors were operating to
maximum effect.






















432 The Journal of Educational Psychology 


TABLE III.—SCORES OF EXPERIMENTAL AND CONTROL GROUPS ON PERIODIC AND
FINAL EXAMINATIONS

Tests | Number of items | Experimental group: Mean ± PE_M | SD | Control group: Mean ± PE_M | SD
Periodic tests.1 | | | | |
Part 1 | | 124.5 ± 1.2 | 20.8 | 110.8 ± 1.2 | 21.6
Part 2 | | 74.8 ± 1.0 | 18.1 | 66.7 ± 1.1 | 19.1
Part 3 | | 78.7 ± 1.2 | 20.9 | 74.0 ± 1.2 | 21.1
Total | | 278.0 ± 2.8 | 49.0 | 251.5 ± 3.0 | 52.4
Final examination. | | | | |
Test A (unannounced) | 118 | 64.4 ± 1.0 | 17.7 | 60.2 ± 1.0 | 17.1
Test B (announced) | 122 | 66.4 ± 0.9 | 16.3 | 66.5 ± 0.9 | 15.3
Total | | 130.8 ± 1.7 | 30.9 | 126.7 ± 1.7 | 29.3
Total term score | 752 | 408.8 ± 4.3 | 76.5 | 378.2 ± 4.5 | 79.8

1 Owing to its length, this test was administered to the control group in two
installments at successive class meetings.


TABLE IV.—EXTENT AND SIGNIFICANCE OF DIFFERENCES BETWEEN GROUPS

Tests | Ratio of experimental to control score | Difference, experimental minus control | PE_diff. | Difference / PE_diff. | Chances in one hundred
Periodic tests. | | | | |
Part 1 | 1.12 | 13.7 | 1.7 | 8.1 | 100
Part 2 | 1.12 | 8.1 | 1.5 | 5.4 | 100
Part 3 | 1.06 | 4.7 | 1.7 | 2.8 | 97
Total | 1.11 | 26.5 | 4.1 | 6.5 | 100
Final examination. | | | | |
Test A (unannounced) | 1.07 | 4.2 | 1.4 | 3.0 | 98
Test B (announced) | 1.00 | −0.1 | 1.3 | −0.1 | 53
Total | 1.03 | 4.1 | 2.4 | 1.7 | 87
Total term score | 1.08 | 30.6 | 6.2 | 4.9 | 100

Table IV indicates the ratio of scores made by the experimental
group to those of the control, and the significance of differences found.1
From column 1 it will be noted that on both Parts 1 and 2, the
experimental section taking weekly tests scored twelve per cent higher than
the controls, and this difference is highly reliable. On Part 3, when
both groups took but one test, the experimental section continued to
lead by six per cent. The latter difference, amounting to but 2.8 PE,
may be due to chance alone, or it may reflect a certain carry-over of
better study habits established during the time when weekly tests were
given. It is noteworthy that the giving of specific weekly reading
assignments to the control section during Parts 2 and 3 produced no
discernible improvement in the showing made by them relative to the
experimental group, or to their own performance on Part 1.
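The entries of Table IV can be re-derived from the means and probable
errors of Table III. The sketch below is not taken from the paper, which
does not state its formulas. It assumes the conventional probable error of
a difference between independent means, PE_diff = sqrt(PE_1^2 + PE_2^2),
and a normal sampling distribution in which one PE equals 0.6745 standard
deviations; the function name is hypothetical. Under those assumptions it
reproduces the tabled figures.

```python
import math

def compare_groups(mean_exp, pe_exp, mean_ctl, pe_ctl):
    """Return (ratio, difference, PE_diff, difference/PE_diff, chances in 100).

    Assumed formulas (not quoted from the paper):
      PE_diff = sqrt(PE_exp^2 + PE_ctl^2)         independent groups
      chances = 100 * Phi(0.6745 * |D / PE_diff|)  normal sampling distribution
    """
    ratio = mean_exp / mean_ctl
    diff = mean_exp - mean_ctl
    pe_diff = math.hypot(pe_exp, pe_ctl)
    critical_ratio = round(diff / pe_diff, 1)
    z = 0.6745 * abs(critical_ratio)  # convert PE units to sigma units
    chances = 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return (round(ratio, 2), round(diff, 1), round(pe_diff, 1),
            critical_ratio, round(chances))

# Means and PE_M values as read from Table III:
# (experimental mean, its PE, control mean, its PE).
table_iii = {
    "Part 1": (124.5, 1.2, 110.8, 1.2),
    "Part 2": (74.8, 1.0, 66.7, 1.1),
    "Part 3": (78.7, 1.2, 74.0, 1.2),
    "Test A": (64.4, 1.0, 60.2, 1.0),
    "Test B": (66.4, 0.9, 66.5, 0.9),
}

for name, row in table_iii.items():
    print(name, compare_groups(*row))
```

The chances figure is computed from the rounded ratio and its absolute
value, since that is what matches the 53 shown for Test B.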

On the end-tests, the experimental group proved to have retained a 
part of the superiority shown on the periodic examinations. Their 
score was seven per cent above the controls when Test A, treating of 
the first two sections of the course, was readministered in an ordinary 
class period without warning. When, however, the two groups came 
prepared for the regular final examination (Test B), they scored equal 
to each other. On the total term score, in which the three mid-term 
tests have a combined weight of five-eighths as against three-eighths 
for the final examination (see relative sigmas), the experimental group 
continued to lead by eight per cent. This may be compared with the 
advantage of twenty-one (?) per cent as found by Turney, eleven per 
cent by Smeltzer, and zero to fifteen per cent by Hertzberg and asso- 
ciates. But it must be remembered that all of these investigators 
provided not merely more frequent, but a very much greater total 
volume of tests for the experimental sections than for their controls, 
and all save Turney returned test papers for correction and review. 


STUDENT OPINION ON TESTS AND ASSIGNMENTS 


At the opening of term both sections were given an extensive
questionnaire touching some thirty issues of educational theory and
practice. In the next to the last week of term they were asked to fill out a
second blank in order that changes in opinion might be noted. All
were assured that answers would in no way affect grades assigned in
the course. Two of the questions contained bore upon the
experimental factors of the present study, namely the effect of frequent
testing, and of weekly as contrasted with monthly assignments. Figures
shown in Table V are those from a sampling of two hundred fifty
students at the outset of the course and the same individuals at the
close.

1 The final column of Table IV states the chances in a hundred that the
difference obtained marks a real distinction in the performance of the two
sections and is not due merely to chance, or errors of sampling. Differences
of four or more times the probable error are regarded as denoting practical
certainty, since such would occur only one time in three hundred by chance
alone.

TABLE V.—OPINION OF CLASS AT OPENING AND CLOSE OF TERM AS TO BEST PLAN
OF ASSIGNMENTS AND THE FREQUENCY OF TESTING DESIRABLE

Statement in questionnaire | Percentage of students favoring this alternative: at start of term | At close of term | PE_diff. | Difference / PE_diff.

I feel that I would gain more real and lasting benefit from a course of this
sort if tests or examinations:
Were given at almost every meeting of the class. | 5.6 | 7.6 | 1.5 | +1.8
Were given at every second, third, or fourth meeting. | 45.8 | 59.2 | 3.0 | +4.5
Were given three or four times a semester. | 37.0 | 24.0 | 2.7 | −4.8
Were given only once or twice a semester. | 1.6 | 2.0 | 0.8 | +0.5
Were confined to a single, comprehensive final examination. | 0.8 | 1.2 | 0.6 | +0.7
Were eliminated altogether, and credit based solely on term papers or other [illegible]. | 9.2 | 6.0 | 1.6 | −2.0
Total | 100.0 | 100.0 | |

As a university student, I feel that I get more benefit from a course of this
sort when assignments are made:
For only one or two periods at a time. | 15.3 | 8.0 | 1.9 | −3.8
For three or four class meetings at a time. | 36.1 | 19.2 | 2.7 | −6.3
With the work of each day specifically indicated, but posted for weeks or months in advance. | 33.7 | 48.4 | 2.9 | +5.1
In the form of large blocks of work to be covered by a given date; e.g., for a mid-term examination. | 14.9 | 24.4 | 2.4 | +4.0
Total | 100.0 | 100.0 | |

As regards frequency of examinations, it will be observed that there 
occurred in the course of the semester a highly significant growth in 
conviction that monthly examinations are less advantageous than
tests given weekly, i.e., at every second, third, or fourth meeting of the
class. This subject was not discussed by the instructor, nor were 
students informed of the experiment in progress. The number who 
would dispense with examinations altogether declined, likewise, but 
this latter difference is statistically unreliable. 

With reference to the presentation of assignments, an even more 
significant change of sentiment may be noted away from the setting 
of tasks for a few days at a time, and in favor of assignments made for 
weeks and months in advance. This outcome was unforeseen by the 
writer. The student attitude appears defensible, however, inasmuch 
as the experimental and control groups reveal no differences which can 
fairly be attributed to the timing of assignments per se. In other 
words, there is no indication that assignments specifying the reading 
to be completed by a particular day or week were in any way more 
effective than those which merely defined the total work to be covered 
in a four or five week period, save as the former were accompanied by 
periodic tests at the intervals named. 


SUMMARY 


The present study endeavors to determine the influence on learn- 
ing and retention of frequent as contrasted with infrequent testing, 
apart from differences in volume of tests administered, or the use of 
test materials as teaching aids. It is based on the performance of two 
sections in educational psychology, numbering one hundred forty-three 
students each, and matched for sex and initial knowledge of the subject. 
The findings indicate that, under conditions of this experiment, 

1. The same tests administered in the form of weekly rather than 
monthly examinations result in a mean performance which is higher by 
twelve per cent; and this difference has high statistical significance. 

2. Retention by the weekly-tested group is some seven per cent
superior, as measured by a comprehensive examination given without
warning, from five to thirteen weeks after the corresponding periodic
tests.

3. On the regular final examination, however, taken after the usual
intensive preparation or “cramming,” no such differences appear.

4. Without comment by the instructor or knowledge of the
experiment in progress, students disclose a strong and growing conviction
of the desirability of tests given as frequently as every second, third,
or fourth class session.

5. Assignments specifying the reading to be covered on a particular 
day or week evidence no superiority over assignments in terms of 
monthly blocks only, except when the former are accompanied by 
periodic tests at the intervals named. 

6. Students express marked preference for assignments covering 
periods of several weeks in advance. 

7. In view of the total superiority of eight per cent manifested by 
the experimental group of the present study, it appears that much, if 
not most, of the learning gains found by such experimenters as Hertz- 
berg, Smeltzer, and Turney can be had without material increase in the 
volume of tests normally administered, merely by giving these tests 
in smaller and more frequent installments.


NORMALCY AS A STATISTIC


JOHN W. DICKEY 
State Normal School, Newark, N. J. 


The description of a distribution of test scores is reported most 
often in tabled frequencies, measures of central tendency such as 
M, or Md, measures of variability such as SD, PE, Q, or AD, and 
measures of the reliability of these two types of statistics. Not 
infrequently the tabled frequencies are accompanied by a graph such 
as the frequency polygon or the histogram. Less often the measures 
of skewness, kurtosis, and goodness of fit are given. 

Each of the above parameters serves as a partial description
of the observed distribution, and the theoretical distribution of which
the observed one is a random sample. The measures of skewness,
kurtosis, and goodness of fit possess the additional attribute of making
a quantitative comparison between the observed distribution and
the theoretical distribution;1 whereas, the graph (frequency polygon
or histogram) makes a qualitative comparison between the two
distributions. The graph describes qualitatively the distribution of
scores as a whole. Psychologically speaking, the whole seems to be
more than the sum of the parts. The known parameters describe
the distribution of scores in parts, and not as a whole. A quantitative
measure of the distribution as a whole is, therefore, lacking. To
quantitize the graph is to supply this statistic.

As evidence that there is a need for a quantitative measure of the
observed distribution which will permit us to think of it as a whole,
and in relation to the Normal Curve as a whole, we hear research
workers say after the known parameters have been given and the
graph has been drawn, that, “the distribution is somewhat normal,”
that it is, “very close to the normal distribution,” that, “it is fairly
normal,” that, “this distribution is more nearly normal than that one,”
that, “I got an almost perfect Normal Curve because it piled up so
nicely,” etc. Other similar statements indicate a need for some
quantitative measure to supplant these biased qualitative descriptions
resulting from the insufficiency of the already known parameters and
the graph to describe the frequency distribution as a whole.





1 Fisher, R. A.: Statistical Methods for Research Workers, Fourth Edition,
Revised and Enlarged. Oliver and Boyd, London, 1932, p. 80.

This paper aims, first, to define a quantitative measure of the
observed frequency-distribution when it is thought of as a whole 
and in relation to the Normal Curve, secondly, to demonstrate how 
this measure may be computed, and thirdly, to validate an easy 
method of computation which yields a close approximation to the 
more exact value. 

This quantitative measure shall be designated the Normalcy
(G) of the distribution. This statistic (G)1 measures in percentage
that proportion of the observed distribution that falls under the
superimposed Normal Curve which has an M-value and a σ-value
equal to that of the observed distribution.

TABLE I.—THE OBSERVED FREQUENCIES, THE THEORETICAL FREQUENCIES OF THE
SUPERIMPOSED NORMAL CURVE, AND THE SMALLER FREQUENCIES WHICH
ALWAYS STAY WITHIN THE BOUNDS OF THE SUPERIMPOSED NORMAL CURVE

Scores | Observed frequencies (f_o) | Theoretical frequencies (f_t) | Smaller frequencies (f_s)
48-50 | 1 | 1 | 1
45-47 | 4 | 3 | 3
42-44 | 8 | 6 | 6
39-41 | 8 | 10 | 8
36-38 | 15 | 14 | 14
33-35 | 16 | 15 | 15
30-32 | 15 | 14 | 14
27-29 | 9 | 11 | 9
24-26 | 7 | 7 | 7
21-23 | 1 | 4 | 1
18-20 | 1 | 2 | 1
15-17 | 3 | 1 | 1
Total | 88 | 88 | 80
M | 34.3 | 34.3 |
σ | 6.9 | 6.9 |


The procedure for the calculation of Normalcy (G) will now be
demonstrated. The method2 involves the computation of the
ordinates (frequencies) of the superimposed Normal Curve which has
the values of N, M, and σ, in common with the observed distribution.
These theoretical ordinates (f_t) will be compared in a particular way
with the observed ordinates (f_o). The data in Table I are the result
of the administration of the writer’s classroom test to eighty-eight
students, and will illustrate the concept of Normalcy (G). The
theoretical frequencies are the ordinates (computation to be explained
presently) of the superimposed Normal Curve corresponding to those
in the observed distribution. The shorter frequencies (f_s) are those
which always stay within the bounds of the Normal Curve; and the
summation of these shorter frequencies is the population which falls
within this boundary. By definition the value of (G) is given by
the equation,

1 The letter (G) is used because it is a measure of the distribution as a whole;
and this whole has been brought to the focus of attention by the Gestalt school
of psychology.

2 Yule, G. Udny: An Introduction to the Theory of Statistics, Eighth Edition,
Revised. Charles Griffin and Company, London, 1927, p. 307.


\[ G = \frac{100 \sum f_s}{N} \tag{1} \]

The Normalcy value for the observed distribution in Table I is,

\[ G = \frac{100 \times 80}{88} = 91 \text{ per cent} \]

The value of G = 91 per cent means that ninety-one per cent of the
observed distribution falls under the superimposed Normal Curve,
or that nine per cent of it falls outside of the Normal Curve.
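As a minimal sketch of equation (1), the frequencies of Table I can be
compared interval by interval, keeping the smaller of each pair. The
function name is an assumption for illustration, and the observed
frequency for the 48-50 interval is taken as 1, the value that makes the
column total 88 and that agrees with Table III.

```python
# Frequencies from Table I, listed from the 48-50 interval down to 15-17.
observed = [1, 4, 8, 8, 15, 16, 15, 9, 7, 1, 1, 3]
theoretical = [1, 3, 6, 10, 14, 15, 14, 11, 7, 4, 2, 1]

def normalcy(f_obs, f_theo):
    """G = 100 * sum(f_s) / N, where f_s is the smaller of f_o and f_t."""
    smaller = [min(o, t) for o, t in zip(f_obs, f_theo)]
    return 100 * sum(smaller) / sum(f_obs)

print(round(normalcy(observed, theoretical)))  # 91
```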

The standard error1 of (G) is,

\[ \sigma_G = \sqrt{\frac{pq}{N}} \tag{2} \]

where, in our example, p = 91 per cent, q = 9 per cent, and N = 88.
The Normalcy value may therefore be written as,

\[ G \pm \sigma_G = 91 \text{ per cent} \pm 3.1 \text{ per cent} \]
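The arithmetic behind the 3.1 per cent can be checked with a few lines.
This sketch assumes the usual standard error of a percentage, with p and
q both expressed in per cent, which is the form that reproduces the value
given; the function name is hypothetical.

```python
import math

def normalcy_standard_error(g_percent, n):
    """Standard error of G in per cent: sqrt(p * q / N), with q = 100 - p."""
    p = g_percent
    q = 100 - p
    return math.sqrt(p * q / n)

print(round(normalcy_standard_error(91, 88), 1))  # 3.1
```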


The computation of the theoretical frequencies (f_t) found in Table I
will now be explained briefly.2 After the two parameters, M = 34.3,
and σ = 6.9, have been computed for the observed distribution, we
next compute the ordinate (y_o) erected at the M of the distribution.
This is done by substituting the values of N = 88, σ = 6.9, and
i = 3 (width of class-intervals)1 in the equation

\[ y_o = \frac{N}{2.5\,\sigma / i} \]

1 This measure of the sampling error includes the measure of Goodness of Fit
as an important special case. Curves other than the Normal Curve may be used
when appropriate, and the concept of (G) called by another name is most general
in application.

2 This computation is also explained by Rugg, H. O.: Statistical Methods Applied
to Education. Houghton Mifflin Company, New York, 1917, p. 211.


The value of (y_o) for our distribution is 15.3. The next step is to
compute the ordinates (f_t’s) at the mid-points of the class-intervals
(16.5, 19.5, 22.5, . . . 49.5). This is done by evaluating the
distances (in σ-units) of each mid-point from the M = 34.3 value; and
these values are then translated to ordinate-values read from such a
table as Rugg’s Table II.2 Each of these tabular values (ordinates)
is then multiplied by y_o = 15.3. For example, the 31.5 mid-point
differs from 34.3 by 2.8 (disregard signs). The 2.8 expressed in σ-units
becomes 2.8/6.9 or 0.41, and with this x/σ-value we enter Rugg’s
Table II. The corresponding ordinate equals 0.92 (the length of the
ordinate when y_o equals unity). The value of 0.92 is multiplied by
y_o = 15.3 and equals 14.1 or 14, which is the (f_t) at mid-point 31.5.
The remaining (f_t’s) are computed in a similar fashion. Of course,
Σf_t = Σf_o.
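The table look-up described above can be sketched in code. As an
assumption, the tabular ordinate for unit height is replaced here by the
expression exp(-(x/σ)^2 / 2), which is what such a table lists; the
constant 2.5 is kept from the text and stands in for the square root of
2π. The function name is hypothetical.

```python
import math

def theoretical_frequencies(n, mean, sigma, width, midpoints):
    """Ordinates of the superimposed Normal Curve at the class mid-points."""
    y0 = n / (2.5 * sigma / width)       # ordinate erected at the mean
    ordinates = []
    for x in midpoints:
        d = abs(x - mean) / sigma        # distance from the mean in sigma-units
        relative = math.exp(-d * d / 2)  # ordinate when y0 equals unity
        ordinates.append(y0 * relative)
    return y0, ordinates

# Mid-points 16.5, 19.5, ..., 49.5 as given in the text.
midpoints = [16.5 + 3 * k for k in range(12)]
y0, f_t = theoretical_frequencies(88, 34.3, 6.9, 3, midpoints)

print(round(y0, 1))             # 15.3
print([round(f) for f in f_t])  # [1, 2, 4, 7, 11, 14, 15, 14, 10, 6, 3, 1]
```

Read from the highest interval downward, the rounded values agree with the
theoretical column of Table I and sum to 88.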

The two distributions—observed and theoretical—in Table I 
are shown graphically in Fig. 1. The jagged polygon in Fig. 1, is 
the observed distribution; the smoothed polygon, the superimposed 
Normal Curve. From looking at it one would probably say that,
“the distribution is somewhat normal”; but to say that G is 91 per
cent, which means that 91 per cent of it falls within the Normal Curve,
is much more definite and significant.

Thus far the need for a measure of Normalcy (G) has been shown;
and the computation of the more exact value of (G) together with its
standard error was demonstrated. The remainder of this paper
concerns itself with an easy method of arriving at a rather close
approximation to the more exact value of (G). The more exact method
of calculating (G) is probably more laborious than many observed
distributions of small populations would warrant.

FIG. 1.—The Normal Curve superimposed on the frequency polygon drawn from the
observed scores recorded in Table I.

1 This “i” is usually not written in this equation but is always understood.
2 Rugg, H. O.: Statistical Methods Applied to Education. Houghton Mifflin
Company, New York, 1917, p. 388.

The gist of the method of approximation now to be presented,
is to use a superimposed isosceles (equal legs) triangle as a substitute
for the superimposed Normal Curve. It is surprising the degree to
which the isosceles triangle approximates (G). In addition to its
accuracy (the error will presently be examined), it is most readily
applied.

First, the isosceles triangle will be compared with the Normal Curve
of unit area to get the (G) for the triangle. Both the areas and
selected ordinates for x/σ-values above and below the M-value are
shown in Table II. In Table II the agreement in areas is surprisingly
close for all practical purposes. The area from M = 0.0 to σ = 0.5
is 0.18 for the isosceles triangle, and approximately 0.19 for the Normal
Curve. The other values are also surprisingly close. The total area
for the Normal Curve within the range of +2.5σ to −2.5σ is more than
0.98, and the area of the triangle is 1.00 when its altitude is
approximately equal to (y_o). The area-values for the Normal Curve were
adapted from Pearson’s Table II;1 whereas, the area-values for the
triangle were computed after the value of (y_o) was established. The
y_o equals 0.40 (correct to the nearest hundredth) for the Normal
Curve. The isosceles triangle has an area of unity when the base line
is 5σ in length (extending from −2.5σ to +2.5σ) and the ordinate
is set at the mean equal to 0.40. Other corresponding ordinates are
given in Table II as z’s (also adapted from Pearson’s Table II).
The close agreement found between the Normal Curve with N = 1
and the isosceles triangle (with an altitude equal to y_o and a base line
extending from −2.5σ to +2.5σ) will be found for any value of N
because of the similarity of the corresponding figures. The data
in Table II are graphically presented in Fig. 2.

1 Pearson, Karl: Tables for Statisticians and Biometricians. Part I, Third
Edition, 1930.


TABLE II.—A COMPARISON OF THE AREAS AND z’s FOR x/σ-VALUES FROM THE
MEAN

Rows headed by a single x/σ-value give ordinates (z); rows headed by an
interval give the area between the two adjacent x/σ-values.

x/σ | Normal curve area | Isosceles triangle area | Smaller area | Normal curve (z) | Isosceles triangle (z) | Smaller (z)
−2.5 | | | | 0.02 | 0.00 | 0.00
−2.5 to −2.0 | 0.01 | 0.02 | 0.01 | | |
−2.0 | | | | 0.05 | 0.08 | 0.05
−2.0 to −1.5 | 0.05 | 0.06 | 0.05 | | |
−1.5 | | | | 0.13 | 0.16 | 0.13
−1.5 to −1.0 | 0.09 | 0.10 | 0.09 | | |
−1.0 | | | | 0.24 | 0.24 | 0.24
−1.0 to −0.5 | 0.15 | 0.14 | 0.14 | | |
−0.5 | | | | 0.35 | 0.32 | 0.32
−0.5 to 0.0 | 0.19 | 0.18 | 0.18 | | |
0.0 | | | | 0.40 | 0.40 | 0.40
0.0 to 0.5 | 0.19 | 0.18 | 0.18 | | |
0.5 | | | | 0.35 | 0.32 | 0.32
0.5 to 1.0 | 0.15 | 0.14 | 0.14 | | |
1.0 | | | | 0.24 | 0.24 | 0.24
1.0 to 1.5 | 0.09 | 0.10 | 0.09 | | |
1.5 | | | | 0.13 | 0.16 | 0.13
1.5 to 2.0 | 0.05 | 0.06 | 0.05 | | |
2.0 | | | | 0.05 | 0.08 | 0.05
2.0 to 2.5 | 0.01 | 0.02 | 0.01 | | |
2.5 | | | | 0.02 | 0.00 | 0.00
Total | 0.98 | 1.00 | 0.94 | 1.98 | 2.00 | 1.88
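The area totals of Table II can be cross-checked by numerical
integration. The sketch below is an illustration only: it uses the exact
normal ordinate rather than Pearson's rounded tabular values, a simple
midpoint rule, and hypothetical function names. The common area it finds
is close to the tabled 0.94, though the last decimal depends on rounding.

```python
import math

def normal_ordinate(x):
    """Ordinate of the unit-area Normal Curve at x (in sigma-units)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def triangle_ordinate(x, height=0.40, half_base=2.5):
    """Ordinate of the isosceles triangle with peak 0.40 and base -2.5 to +2.5."""
    return max(0.0, height * (1 - abs(x) / half_base))

def integrate(f, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))

def smaller_ordinate(x):
    """The lower of the two curves at x, i.e. the part common to both."""
    return min(normal_ordinate(x), triangle_ordinate(x))

normal_area = integrate(normal_ordinate, -2.5, 2.5)
triangle_area = integrate(triangle_ordinate, -2.5, 2.5)
common_area = integrate(smaller_ordinate, -2.5, 2.5)

print(normal_area, triangle_area, common_area)
```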























The Normalcy (G) is represented in Fig. 2 by the lined portion.
In Table II this was computed in two ways. When the “Smaller
Area” column was summed (0.94) and divided by the area (0.98),
we get G = ninety-six per cent. When the “Smaller z” column was
summed (1.88) and divided by the sum of the “Normal Curve z”
column (1.98), we get the same value for (G). This is to be expected
because both “area” and “z” are functions of N. The important
thing here is that the error made is only four per cent when the curve
is normal. This does not mean that when (G) is estimated with the
help of the isosceles triangle for an observed frequency-distribution
that the error will be four per cent. In most cases it probably will
be considerably less than four per cent due to the irregularity of the
observed polygon. The maximum possible error would occur (in
the opposite direction) if an observed polygon were to cut the triangle
in two points cut by the Normal Curve, at approximately ±1.0σ
from the M = 0.0 (neglecting the two points near the ±2.5σ). If
such a bi-modal curve (say) were to cut at these two points the
maximum possible error would be in the neighborhood of six per cent (see
Table II), that is, the estimated value of (G) would be six per cent too
large. In Table II it is seen that the (G) for the triangle is ninety-six
per cent or four per cent too small in its estimate (a conservative
estimate). Therefore the range in the error of estimate when using
the triangle, runs approximately from −4 per cent to +6 per cent with
a mean value of about 1.0 per cent. In the long run most errors will
be small when the isosceles triangle is used to estimate (G). For any
one observed distribution, however, we must recognize this maximum
possible error.

FIG. 2.—Showing the isosceles triangle superimposed on the Normal Curve. The
hatched portion shows the ninety-six per cent of the triangle that is within the bounds
of the Normal Curve.

The Normalcy value (G) for the observed distribution in Table I
will now be estimated with the help of the isosceles triangle. After
the M and the σ are computed, the three points which determine our
triangle are computed, namely, y_o = N/(2.5σ/i), M + 2.5σ, and
M − 2.5σ. They are y_o = 15.3, M + 2.5σ = 51.6, and M − 2.5σ =
17.0. The isosceles triangle is then carefully drawn, preferably on
squared paper, connecting these three points. The ordinates (f_a) are
then estimated and their values recorded as in Table III. In Table III
the estimated frequencies (f_a) are compared with the observed
frequencies (f_o), and in this table they may also be compared with the
theoretical frequencies (f_t). Σ(f_a) = Σ(f_o), approximately. When
the triangle is used the Normalcy value may be designated by (G_a)
to distinguish it from the more exact value (G). The (G_a) for our
example is,

\[ G_a = \frac{100 \times 78.8}{88} = 90 \text{ per cent} \]

The value of (G_a)1 compared with (G) is in error approximately
−1 per cent in this example. (With another example with N =
82, M = 58.4, and σ = 4.9, the value of (G) = (G_a) = 92 per cent.)
The error in our first example is probably due to the fact that the
observed polygon was to a high degree normal. Other factors also
enter here; but in general, it appears that the higher the value of (G)
the more likely is there to be the larger error in the conservative
direction. This error will always fall, however, within the
approximate limits of ±5 per cent.

1 The standard error of (G_a) is probably not trustworthy due to the error
involved in (G_a) itself.

TABLE III.—THE FREQUENCIES ESTIMATED FROM THE ISOSCELES TRIANGLE,
COMPARED WITH THE OBSERVED FREQUENCIES AND THE THEORETICAL
FREQUENCIES FOUND IN TABLE I

Scores | Observed frequencies (f_o) | Frequencies estimated from triangle (f_a) | Smaller frequencies (f_s) | Theoretical frequencies (f_t)
48-50 | 1 | 2.0 | 1 | 1
45-47 | 4 | 4.5 | 4 | 3
42-44 | 8 | 7.2 | 7.2 | 6
39-41 | 8 | 9.9 | 8 | 10
36-38 | 15 | 12.5 | 12.5 | 14
33-35 | 16 | 15.3 | 15.3 | 15
30-32 | 15 | 12.8 | 12.8 | 14
27-29 | 9 | 10.0 | 9 | 11
24-26 | 7 | 7.0 | 7 | 7
21-23 | 1 | 4.9 | 1 | 4
18-20 | 1 | 2.0 | 1 | 2
15-17 | 3 | 0.0 | 0 | 1
Total | 88 | 88.1 | 78.8 | 88

FIG. 3.—Showing the isosceles triangle and the Normal Curve superimposed on the
frequency polygon drawn from the point scores made on the writer’s class-room test by
eighty-eight students. The irregularities of the observed polygon compensate when
the triangle is used to estimate the Normalcy of the observed distribution.

The data recorded in Table III are shown in Fig. 3. The Normal
Curve and the isosceles triangle have both been superimposed on
the observed frequency-polygon. A study of Fig. 3 exemplifies the
way in which compensations are made due to the irregularities in
the observed frequency-polygon so that the value of (G_a) will usually
not be far from correct.
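The triangle estimate can also be sketched without squared paper. The
paper reads the ordinates (f_a) off a drawn triangle; as an assumption,
the code below computes them from the straight-line sides instead, so
individual ordinates differ slightly from those in Table III and the
result comes out near 89 per cent rather than the 90 per cent obtained
from the graph. The function name and the mid-points 49.5, 46.5, ... 16.5
follow the earlier worked example and are otherwise hypothetical.

```python
def triangle_normalcy(f_obs, midpoints, mean, sigma, width):
    """Estimate G_a using an isosceles triangle in place of the Normal Curve.

    The triangle has its apex y0 = N / (2.5 * sigma / width) at the mean and
    its base running from mean - 2.5 * sigma to mean + 2.5 * sigma.
    """
    n = sum(f_obs)
    half_base = 2.5 * sigma
    y0 = n / (half_base / width)
    smaller_total = 0.0
    for f_o, x in zip(f_obs, midpoints):
        # Height of the triangle at this mid-point (zero outside the base).
        f_a = max(0.0, y0 * (1 - abs(x - mean) / half_base))
        smaller_total += min(f_o, f_a)
    return 100 * smaller_total / n

# Observed frequencies from the 48-50 interval down to 15-17.
observed = [1, 4, 8, 8, 15, 16, 15, 9, 7, 1, 1, 3]
midpoints = [49.5 - 3 * k for k in range(12)]

g_a = triangle_normalcy(observed, midpoints, mean=34.3, sigma=6.9, width=3)
print(round(g_a, 1))
```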


CONCLUSIONS 


1. The known measures of an observed frequency-distribution
do not give an adequate picture when the distribution is to be thought
of as a whole.

2. The graphic presentation of an observed frequency-distribution
fails to give a quantitative description when the distribution is thought
of as a whole.

3. To more accurately picture the observed distribution as a whole
and in relation to the Normal Curve, the Normalcy (G) of the observed
distribution has been defined as a statistic supplementary to those
already known.

4. The computation of the more exact value of (G) together with
its approximation (G_a), was demonstrated.

5. Both the formula for the standard error of (G), and the nature
of the error involved in (G_a) are reported.


THE EDUCATIONAL AND VOCATIONAL STATUS OF
UNIVERSITY OF MINNESOTA STUDENTS HAVING
LOW COLLEGE APTITUDE RATING


CHARLES W. BOARDMAN AND FRANK H. FINCH 


University of Minnesota 


The problem of predicting students’ success in college upon the 
basis of data obtained prior to their admission to institutions of higher 
learning has engaged the attention of numerous investigators during 
the past ten years, resulting in the accumulation of a voluminous 
literature upon various phases of the subject. Instructors’ marks 
have been employed almost universally as the criterion of college suc- 
cess, frequently based upon the marks of only one or two semesters. 
The variables which have commonly been correlated with the criterion 
are scores on a psychological test and marks of high school instructors. 

The results of these investigations, although illuminating, have not 
yet disclosed a satisfactory method for predicting college success. 
A summary1 of the literature reporting coefficients of correlation
between psychological test scores and college marks shows a range
from .10 to .75, the median being .44. The coefficients reported
between high school and college marks2 range from .15 to .80, the
median falling at .53. When psychological test scores and high school
marks are combined, the multiple R between them and college marks3
centers about .60. Measures for predicting college success whose
correlations with the criterion tend to center within the range from 
.45 to .60 have only limited usefulness. While they are of value in 
indicating the probable performance of a group, the degree of error in 
predicting the success of individuals is so large that they may be used 
only with great caution. 

For several years a combination of the percentile rank computed
from the score of an individual upon a psychological test and the
percentile rank in the high school graduating class has been used at the
University of Minnesota as a basis for predicting college success.
This measure is called the College Aptitude Rating. It has been used
as a basis upon which to offer advice to prospective college entrants
and their parents concerning the probable success of these entrants
in college. Students who rank in the lower levels of the College
Aptitude Rating are presumed not to be good college risks. The exact
points at which the critical lines are drawn, below which a prospective
student is thought not to be of college calibre, are not perfectly clear
and vary in different colleges. The twenty-fifth percentile has been
suggested in one college of the University as the threshold of success
for women, and the thirty-fifth percentile for men. But even in this
college during the past ten years there has been some variation in the
location of the critical point.

1 Kinney, L. B.: A Summary of the Literature on the Use of Intelligence Tests
in Colleges and Universities. Unpublished manuscript on file in Committee of
Educational Research, University of Minnesota, 1931.

2 Douglass, Harl R.: “Relation Between High School Preparation and Certain
Other Factors to Academic Success at the University of Oregon.” University
of Oregon Publications, Education Series, Vol. III, No. 2, pp. 15-16, Eugene,
University of Oregon Press, 1931.

Symonds, P. M.: Measurement in Secondary Education. Ch. 20, New York,
The Macmillan Company, 1928.

3 Douglass, Harl R.: Op. cit., p. 48, Table 24, and p. 50, Table 25.

The scattergram has been the chief instrument used to portray 
the relationship between the College Aptitude Rating and college 
success. Few coefficients of correlation have been published; those 
which have appeared seem to center about .69. Although this coeffi- 
cient is somewhat higher than the central tendency found by other 
investigators, its magnitude indicates that the instrument does not 
enable accurate prediction to be made concerning the success of 
individuals, as Johnston¹ has pointed out in a number of places.
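
The force of this limitation may be made concrete by a short computation. The sketch below rests on an assumption not stated in the studies cited, namely that the criterion is predicted from the rating by an ordinary linear regression. Under that assumption the standard error of estimate equals the standard deviation of the criterion multiplied by the square root of one minus the square of the coefficient. The function name `alienation` is illustrative only.

```python
import math


def alienation(r: float) -> float:
    """Return sqrt(1 - r^2): the share of the criterion's standard
    deviation that remains as error after linear prediction (assumed)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    return math.sqrt(1.0 - r * r)


# Coefficients mentioned in the text: the usual range .45 to .60, and .69
for r in (0.45, 0.60, 0.69):
    print(f"r = {r:.2f}: error of estimate = {alienation(r):.3f} of the criterion SD")
```

On this reckoning even a coefficient of .69 leaves an error of estimate of roughly seven-tenths of the spread of the criterion itself, which is consistent with the caution urged above.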

In view of the moderate correlation which exists between such 
measures and college success as determined by instructors’ marks, 
it has seemed desirable to undertake a case study of the college and 
subsequent history of students who at entrance into the University 
ranked in the lower levels of the College Aptitude Rating. These 
students were selected because they are the ones who, on the basis of 
the rating, would presumably be least successful in their college careers.
The results of such an investigation are being reported here. 

In 1924 the Committee of Seven, appointed by the President of the 
University of Minnesota to study the problems of articulation between 
the secondary schools of Minnesota and the University, collected data 
concerning four hundred and sixteen graduates of Minneapolis and 
St. Paul high schools who entered the freshman class of the College 
of Science, Literature and Arts in October of that year. These students
were selected by chance and comprised fifty per cent of the
graduates of these high schools who entered the college that fall.

1 Johnston, J. B.: “Student Aptitude and Prediction of Student Scholarship.”
Bulletin of University of Minnesota, Vol. XXX, No. 75, Nov. 8, 1927, pp. 9-10.
Who Should Go to College? Minneapolis, The University of Minnesota
Press, 1930, p. 11.

This study presents the complete records in higher education, 
together with the vocational history after leaving college, of the one 
hundred nineteen students from this group who were in the lowest 
forty per cent upon the College Aptitude Rating. The fortieth per- 
centile was chosen as the critical point below which to select subjects 
for this study because published data¹ show that approximately only
6.7 per cent of the students falling below this point are satisfactory 
college risks. 

The first step in this study was to compile the college educational 
history of these students. From the office of the Registrar of the 
University of Minnesota were obtained the complete records of all 
course work taken by these students in any college within the univer- 
sity from the date of entrance until the opening of the winter quarter 
1932-1933, a period of over eight years. At that time, four of this 
group were still registered as students in the University. In the 
cases of students who transferred from the University of Minnesota 
to other colleges, a record of the dates of attendance, amount of work 
completed, and degree conferred, if any, was obtained from the 
registrar of the specific institution enrolling the student. No detailed 
record of courses taken or marks earned was collected for such cases, 
nor were such data included in the compilation of the scholastic honor 
point averages used in this study as one measure of college success. 

The vocational records of all the students subsequent to their 
college careers were obtained by personal interview with the subject 
or, when this method was not possible, by interview with the employer 
or members of the immediate families of the subjects. In five cases, 
individuals who had removed from the Twin Cities furnished the 
information by correspondence, filling out the form employed in 
recording information obtained by interviews. Data available in the 
University Alumni Office, the Bureau of Recommendations, and 
listings in the city directory were used to supplement individual 
reports. Information concerning positions held was, for most subjects, 
obtained in the form of a continuous history from the time they left 
the University until the fall of 1932. In this report two cross sections 
are given in full, one for the year 1929 and the other for 1932; these 
show the occupational status of each individual at those periods. 





1 Calculated from data in Johnston, J. B.: Who Should Go to College? Minneapolis,
The University of Minnesota Press, 1930, Table II, p. 21.

Since the primary purpose of such measures as the College Aptitude
Rating is to predict success in higher education, attention is
directed first to the accomplishments of these students in their college 
work. In Table I is offered a distribution of all the subjects included 
in this study according to their College Aptitude Rating and also the 
number who earned a college degree. As would be expected, in view 
of the correlation of the rating with college marks, many of these 
students did not receive any degree. But it is significant that forty- 
three per cent of those in the third decile, a very low rating, received a 
degree from some college and that twenty per cent in the second decile, 
which is below any critical point that has been suggested as a basis 
for prediction, were awarded college degrees. Over twenty-four 
per cent of the total group have received a degree from some college. 


TABLE I.—DISTRIBUTION ACCORDING TO COLLEGE APTITUDE RATING OF ALL
SUBJECTS AND OF SUBJECTS EARNING DEGREES¹

College Aptitude   No. of     No. earning   No. receiving   Per cent earning
Rating             students   degrees¹      no degree       degrees
31-40                 36          11             25              30.55
21-30                 32          14             18              43.75
11-20                 20           4             16              20.00
1-10                  31           0             31               0.0
Total                119          29             90              24.37

1 Three students who transferred from the University of Minnesota received 
degrees from other institutions. Two of these were in the third and one in the 
fourth decile of the College Aptitude Rating. 
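
The percentages in the last column of Table I follow directly from the counts, and a brief sketch may help a reader who wishes to check them. The counts are those of the table; the names `TABLE_I` and `per_cent` are introduced here for illustration only. When rounded to two places the first band gives 30.56, whereas the table prints 30.55, which suggests that the printed figure was truncated rather than rounded.

```python
# (rating band, number of students, number earning degrees), taken from Table I
TABLE_I = [
    ("31-40", 36, 11),
    ("21-30", 32, 14),
    ("11-20", 20, 4),
    ("1-10", 31, 0),
]


def per_cent(part: int, whole: int) -> float:
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


for band, students, graduates in TABLE_I:
    print(f"{band:>6}  {per_cent(graduates, students):6.2f}")

# Totals over all four bands
total_students = sum(row[1] for row in TABLE_I)
total_graduates = sum(row[2] for row in TABLE_I)
print(f" Total  {per_cent(total_graduates, total_students):6.2f}")
```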


Thirty-six college degrees were earned by students in this group 
(Table II). The apparent discrepancy between the number of 
students receiving degrees (twenty-nine) and the number of degrees 
awarded (thirty-six) is explained by the fact that several students 
earned two or more degrees. It is of interest to note that seven colleges 
in the University of Minnesota awarded thirty-three of these degrees, 
twenty of them being earned in professional colleges. Two institu- 
tions outside the University of Minnesota bestowed three degrees upon 
students in this group. 

TABLE II.—SOURCE OF DEGREES EARNED AND THE NUMBER AWARDED BY EACH
COLLEGE

COLLEGE AWARDING DEGREES                          NUMBER OF DEGREES AWARDED
Colleges within the University of Minnesota
  Science, Literature and Arts                    [illegible]
  [names and figures of the remaining colleges illegible]
Other colleges
  [names and figures illegible]
Total                                             36


Another illuminating aspect of this study is the number of degrees
received by different individuals (Table III). One individual in the
fourth decile has earned two degrees, and one individual in each of the
second, third, and fourth deciles has been awarded three degrees.
The remaining twenty-five individuals have each received one degree.
It should be noted that of the four persons in the second decile, which
is below any suggested threshold of college success, three have earned
one degree, and one has earned three degrees, all of them having been
awarded by some college within the University of Minnesota.

If earning a college degree is evidence of college success, it is
apparent that many individuals who stand low upon the College 
Aptitude Rating are satisfactory college material. The facts shown by 
these tables throw into clear relief the weakness of measures for pre- 
dicting college success which have only a moderate correlation with 
college marks. Such measures are of value in determining group 
tendencies but the degree of error in them is so large that they can 
not be used by themselves to predict the success of individuals. Even 
the best trained and most competent counselors, using all the informa- 
tion available, are not able to predict accurately the performance of 
individual students. The probability of error or of misuse of such 
measures as a means of advising prospective college entrants is great 
when they are placed in the hands of persons ignorant of their limita- 
tions or of the need for supplementary information. This danger 
appears to be greater when a critical point is suggested below which 
it is presumed that an individual will not succeed in college. In this 
connection it is significant to note that three of the twelve women and 
eleven of the fourteen men who received one or more degrees from the 
University of Minnesota stood below any critical point of success 
which has been suggested for the respective sexes. 
TABLE III.—DISTRIBUTION OF THE NUMBER OF DEGREES EARNED WITH RESPECT
TO COLLEGE APTITUDE RATING OF THE RECIPIENTS

                   Number of individuals earning
College Aptitude   One       Two       Three      Total number
Rating             degree    degrees   degrees    of degrees
31-40                 9         1         1           14
21-30                13         0         1           16
11-20                 3         0         1            6
1-10                  0         0         0            0
Total                25         1         3           36

TABLE IV.—QUARTER HOURS OF WORK COMPLETED AND NUMBER OF DEGREES
EARNED BY ENTRANTS INTO THE COLLEGE OF S.L.A. IN 1924 FROM TWIN
CITY HIGH SCHOOLS WHO WERE IN THE LOWEST FORTY PER CENT ON
THE COLLEGE APTITUDE RATING

Amount of work in quarter hours    0   20   40   60   80  100  120  140  160  180 up
Total                             27   28    4    7    6    9    6    4    1   27¹

Number of degrees                  1      2      3      Total subjects
Total                            (22)    (1)    (3)         119

[The rows for the separate College Aptitude Rating levels, 40 down to 1, are illegible in the source; only the column totals are reproduced.]

1 Note: One student earned over one hundred eighty credit hours but had
insufficient honor points to be awarded a degree.

Another measure of college success is the quantity of course work
in which an individual earns college credit. The relation between the
College Aptitude Rating and the number of quarter hours credit earned
at the University of Minnesota is set forth in Table IV. Although
sixteen men and eleven women received no credit for University 
courses, the discrepancy between prediction and performance stands 
out clearly in these tables. Many persons in the lowest levels on the 
College Aptitude Rating earned a large number of quarter hours of 
credit. A total of forty-seven individuals earned one hundred or 
more quarter hours credit, an amount sufficient to meet the university 
requirements for classification as juniors, provided other requirements 
are satisfactory. 
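
The counts quoted in this paragraph can be recovered from the column totals of Table IV, as the following sketch indicates. It assumes that each column heading identifies one class of quarter hours, since the source does not say how the classes are bounded. The names `TOTALS` and `students_at_or_above` are illustrative only.

```python
# Column totals of Table IV: column heading (quarter hours) -> number of students
TOTALS = {0: 27, 20: 28, 40: 4, 60: 7, 80: 6, 100: 9, 120: 6, 140: 4, 160: 1, 180: 27}


def students_at_or_above(threshold: int) -> int:
    """Count students in every column whose heading is at least threshold."""
    return sum(count for heading, count in TOTALS.items() if heading >= threshold)


print(sum(TOTALS.values()))       # all subjects in the study
print(TOTALS[0])                  # students credited with no University work
print(students_at_or_above(100))  # students in the columns headed 100 and upward
```

The three figures so obtained agree with the one hundred nineteen subjects, the twenty-seven who received no credit, and the forty-seven who earned one hundred or more quarter hours.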

Since quality as well as quantity of work is a requisite for gradua- 
tion from the University of Minnesota, the data concerning the scholas- 
tic achievements of these students as measured by marks are of interest. 
The honor point average was used as the measure of quality of work. 
This average is obtained by assigning a numerical value to each of 
the marks in the marking scale¹ and computing the average of this
value for all the marks received by a student. Table V shows the 
relationship between honor point average and the College Aptitude 
Rating. 
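
The computation of the honor point average may be sketched as follows, using the values given in the footnote on the marking scale (A = 3, B = 2, C = 1, D = 0, F = -1). The sketch takes a simple mean over the marks received, as the description above suggests; whether the University weighted each mark by the credit hours of the course is not stated, so no such weighting is assumed. The function name and the sample marks are hypothetical.

```python
# Values assigned to each mark, as given in the footnote on the marking scale
HONOR_POINTS = {"A": 3, "B": 2, "C": 1, "D": 0, "F": -1}


def honor_point_average(marks):
    """Return the unweighted mean of the honor points for a list of marks."""
    if not marks:
        raise ValueError("at least one mark is required")
    return sum(HONOR_POINTS[mark] for mark in marks) / len(marks)


# A hypothetical student with one B, two C's, and one D
print(honor_point_average(["B", "C", "C", "D"]))
```

On this scale an average of zero corresponds to the lowest passing mark, D, and an average above two corresponds to work better than B.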

The honor point average required for graduation is not uniform 
among the colleges in the University; consequently the meaning 
of the honor point average earned by a student varies according to the
college in which he is enrolled. It is clear from these tables that
many students, both men and women, did college work of a low quality.
The same fact has been shown to be true, however, for persons of a
much higher rating.² The significant point is the discrepancy between
the prediction according to the College Aptitude Rating and the 
honor point average earned. Eighty-six students (Table V) earned 
honor point averages ranging in value from the lowest passing mark, D, 
to an average better than B. 

1 The weighted value assigned each mark is as follows: A = 3, B = 2, C = 1,
D = 0, F = -1.

2 Johnston, J. B.: Who Should Go to College? Minneapolis, University of
Minnesota Press, 1930, pp. 20-21.

These comparisons between the rank of students upon the College
Aptitude Rating and their accomplishment in college all point toward
the amount of error in the prediction of the performance of individual
students. There are probably a number of causes which are responsible
for the discrepancies which have been found. Just what these
causes are, or how important any one of them is, cannot be determined
from the data at hand. Some hypotheses can be suggested, however,
concerning factors which may contribute to the variations noted. The
marks of college instructors, which form the criterion of college success, 
may be so unreliable as to limit the validity of the criterion itself. 
Bohan¹ has shown that the variability in college marks, not only
between departments but also between instructors within a depart- 
ment, is so great that a student’s success may be determined by the 
department in which he elects courses or by the instructors under 
whose tutelage he sits. The effect of such unreliability in marks 
upon a criterion based upon them is increased when the marks of a 
single year, such as the freshman year, are used in making the criterion. 
Other factors within the university, such as the nature of the courses 
offered or the methods of instruction, may likewise affect the students’ 
achievement and account for variations between prediction and 
performance. 

Another source of the discrepancies which have been found may 
lie in the data from which the College Aptitude Rating is derived. 
Lack of reliability in either of the two variables composing the rating, 
the college ability test, or high school marks, would affect the accuracy 
of the ratings and the predictions based upon them. The presence or 
absence of characteristics or traits which are important for college 
success but which are not measured by the College Aptitude Rating, 
the nature of the students’ educational background, the effect of his 
environment upon his performance, and many other factors may be 
additional sources of error. The purpose here is not to attempt to 
analyze the causes of the discrepancies which are found between 
prediction and success, but rather to point out that, whatever may 
be the causes contributing to them, their existence is evidence of 
inability to predict successfully the performance of individuals in 
college upon data obtained by the measure used. 

Another measure of the ability and achievement of individuals is 
the vocation they enter and their success therein. In Table VI there 
is summarized the occupational status of these students as it was in 
1929 and again in 1932. From this table it is evident that considerable 
numbers of these students have entered and are holding positions 
which require a rather extended period of professional and academic 
training. A surprising fact is that only five of these individuals 
were unemployed in 1932. Apparently these persons who are found
in the lower forty per cent upon the College Aptitude Rating have
abilities which make them preferred in employment even in these
troubled times.

1 Bohan, John E.: Students’ Marks in College Courses. Minneapolis, University
of Minnesota Press, 1931.

456 The Journal of Educational Psychology


TABLE VI.—OCCUPATIONAL DISTRIBUTION OF ONE HUNDRED NINETEEN STUDENTS
IN 1929 AND 1932 WHO ENTERED THE COLLEGE OF S.L.A. IN 1924 WHO
WERE IN LOWER FORTY PER CENT OF COLLEGE APTITUDE RATING

Occupation                                    1929    1932
[illegible]                                    ...       4
[illegible]                                      8       6
[illegible]                                    ...       3
[illegible]                                    ...       2
[illegible]                                      1       1
[illegible]                                    ...       1
[illegible]                                      2       1
[illegible]                                      2       3
[illegible]                                      2       2
[illegible]                                     27       5
Own business or corporation executive            8      14
[illegible]                                     25      24
[illegible]                                     23      11
[illegible]                                      8      22
[illegible]                                      8      12
[illegible]                                      2       5
[illegible]                                      2       2
Total                                          119     119

[Occupation labels are illegible in the source except where shown; rows with a single surviving figure are entered under 1932, which is uncertain.]

An attempt is made in Table VII to make a rough evaluation of
the occupational levels of the employed individuals. The scale¹
used is a refinement of Taussig’s scale of occupations, using as a base
the proportions of persons employed in each occupation as reported in
the 1920 census. In evaluating the occupational status of this group
thirty-one persons were omitted, including married women not
employed outside their homes, students, and others for whom the
information available was not sufficient to permit a classification.
Columns 3 and 4 of Table VII show the actual and the cumulative
percentage of the subjects of this study who were engaged in an
occupation classified in each category of this scale, and columns
5 and 6 the comparable percentages of persons engaged in occupations
of these classes according to the 1920 census.

1 Goodenough, F. L. and Anderson, J. E.: Experimental Child Study. New
York, The Century Company, 1931, pp. 234ff.


TABLE VII.—OCCUPATIONAL STATUS IN 1932 OF EIGHTY-EIGHT CASES CLASSIFIED
IN TERMS OF THE MINNESOTA OCCUPATIONAL SCALE¹

                                Cumulative   Per cent in    Cumulative per cent,
Class     Number   Per cent     per cent     1920 census    1920 census
I           19       21.6         21.6           2.5              2.5
II          21       23.9         45.5           4.7              7.2
III         28       31.8         77.3          14.4             21.6
IV           1        1.1         78.4          18.8             40.4
V           19       21.6        100.0          27.4             67.8
VI           0        0.0        100.0          13.3             81.1
VII          0        0.0        100.0          18.9            100.0
Total       88      100.0        100.0         100.0            100.0

1 For a description of this Scale see Goodenough, F. L. and Anderson, J. E.: 
Experimental Child Study. New York, The Century Company, 1931, Appendix A. 


Although the fact that an individual is engaged in a particular 
occupation is not an adequate criterion of his success in that field, 
his ability may be judged somewhat by the fact that he holds such a 
position. It is apparent from this table that the majority of the 
individuals in this group hold positions commonly thought to require 
ability distinctly above average. None of them are employed in 
occupations classified in the two lowest categories. Over twenty-one 
per cent are engaged in occupations of the highest order as compared to 
two and one-half per cent so employed in the United States, and 
seventy-seven per cent hold positions classified in the highest three 
levels as compared to about twenty-two per cent in the United States 
in 1920. 
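
The comparison drawn in this paragraph can be checked from the counts and census percentages of Table VII. The sketch below is offered only as an aid to the reader; the names `STUDY_COUNTS`, `CENSUS_PER_CENT`, and `cumulative_per_cent` are illustrative and do not appear in the study.

```python
from itertools import accumulate

CLASSES = ["I", "II", "III", "IV", "V", "VI", "VII"]
STUDY_COUNTS = [19, 21, 28, 1, 19, 0, 0]                     # Table VII, column 2
CENSUS_PER_CENT = [2.5, 4.7, 14.4, 18.8, 27.4, 13.3, 18.9]   # Table VII, column 5


def cumulative_per_cent(counts):
    """Return the running total of counts as a percentage of their sum."""
    total = sum(counts)
    return [100.0 * running / total for running in accumulate(counts)]


study_cumulative = cumulative_per_cent(STUDY_COUNTS)
census_cumulative = list(accumulate(CENSUS_PER_CENT))

for name, study, census in zip(CLASSES, study_cumulative, census_cumulative):
    print(f"{name:>3}  study {study:5.1f}   census {census:5.1f}")
```

The third row of the output gives the contrast cited in the text: about seventy-seven per cent of the group in the three highest classes, against about twenty-two per cent of the population in the 1920 census.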

These facts appear to indicate that the subjects of this study possess 
abilities of a superior character, from a vocational standpoint at least, 
when compared to the general population. Beyond this, their meaning 
is not clear, and the implications which may be drawn will vary accord- 
ing to the point of view of the interpreter. It may be that many of 
the abilities which enable these students to obtain and hold business 
and professional positions of a high order are not of value for success 
in colleges of the present day. It may be that colleges do not offer 
curricula or courses adapted to meet the needs, interests, or capacities 
of these students. It may even be argued by some that these students
have obtained from college the services which would be valuable to 
them. Whatever the conclusions drawn, the data are evidence that 
these students have abilities that enable them to obtain employment
in occupations of a much higher level than those of the general population.

The purpose in initiating this study was to follow a group of stu- 
dents of low College Aptitude Rating through the university and for a 
period of years in employment in order to discover their achievement 
in college and in the occupations they entered. The results seem to 
emphasize the inability to predict college success for individuals upon 
the basis of measures which have only a moderately high correlation 
with instructors’ marks. While it may be true that there are individ- 
uals in this group who may not have profited from such courses or 
curricula as are commonly offered in college, the persons who were 
able to complete prerequisites for professional colleges, to earn one or 
more degrees, or who, upon leaving college were able to find and hold, 
even in these times of unemployment, positions of responsibility and of 
relatively high status are too numerous to be ignored. The achieve- 
ments of these students who rank low on the College Aptitude Rating 
indicate clearly the failure of such measures to predict the success of 
individual students. 
LEARNING FROM LECTURES VS. LEARNING FROM
READINGS 


STEPHEN M. COREY 
University of Nebraska 


Numerous investigations have been made recently of the compara- 
tive value of different teaching methods employed in institutions of 
higher learning. The attitude of college administrators regarding 
instruction seems to be changing from “laissez faire” to something a
bit more provocative to student learning. The thirty-first yearbook
of the National Society for the Study of Education, entitled “Changes
and Experiments in Liberal Arts Education” (1932), was encouraging
in its reports of the amount of attention being directed toward curricular
and instructional problems. Even as early as 1928, Carter
V. Good was able to accumulate some two hundred forty-five references
“On College Teaching with Special Emphasis on Methods of
Teaching.”¹

Relatively few of these studies, however, have dealt with the 
lecture, which is one of the unique aspects of collegiate instruction, 
and which from certain points of view would seem to be receiving 
increased emphasis as an instructional method. For example, the
widely accepted Minnesota studies² of the effect of class size upon
teaching and learning have served to create a more tolerant attitude 
toward lecturing, particularly in large institutions experiencing 
financial worries. Another development which has had much the 
same effect is the unprecedented popularity of survey and orientation 
courses in which the favored method of teaching seems to be by 
lecturing. The wide acceptance of these courses is indicated by the 
fact that in 1931 they were offered in some seventy-eight³ representative
universities and colleges, this curricular innovation being second
only in popularity to “honors courses.”





1 Good, Carter V.: Bibliography on College Teaching with Special Emphasis on 
Methods of Teaching—Yearbook. Number XVI, 1928, National Society of College 
Teachers of Education, pp. 66-96. 

2 Hudelson, Earl: Class Size at the College Level, University of Minnesota Press,
Minneapolis, Minn., 1928, p. 299.

3 Whipple, Guy Montrose (Ed.): Changes and Experiments in Liberal Arts Education—Yearbook.
Number XXXI. National Society for the Study of Education,
1932, pp. 26-40.

On the other hand, not a few of the more liberal changes which may
be observed in instructional practices at the college level would indicate 
dissatisfaction with the lecture method. Every increase in extension 
work by correspondence is a direct implication not only that the 
lecturer is a superfluity but that teachers in general are not essential 
for learning. The trend toward reading for honors and tutorial 
instruction¹ might readily be considered an index of this changing
emphasis as indeed might all attempts to adapt teaching to individual 
students. Such adaptation is precluded by the very nature of the 
lecture, where either the rate of understanding is assumed to be 
constant for all students concerned, an absurd belief, or equally bad, 
no thought is given to the matter. 

From a certain point of view, the lecture method of teaching may 
be considered as an anachronism. It developed and thrived, although 
never without criticism, during the early university period, largely 
because of the scarcity of books and manuscripts available for student 
use. The university faculties then knew many things not generally 
available, and the only feasible method for them to disseminate their
learning was by telling or lecturing. Instructors using this method 
became famous, not necessarily because of the charm of their manner 
or the stimulus they provided as teachers, but chiefly on account of 
the valuable subject-matter they had to offer—because of the encyclo- 
pedic range of their scholarship. Historically, those persons who made 
reputations as teachers as opposed to wells of information have in 
most cases been “dialecticians” in the best sense of the word:
conversationalists, discussion group leaders. They have made
their greatest contributions along the line of stimulating others to 
learn rather than of presenting the fruits of their own learning to be 
memorized. 

1 Hanford, A. C., Dean of Harvard College, writes: “The successful tutor is one
who says very little himself, because if he does otherwise the tutorial method
degenerates (italics mine) into a lecture and the tutor into a coach.” Ibid., p. 177.

With modern advances in printing and multigraphing, the historical
argument for the lecture method of teaching has been weakened if not
destroyed. Information in permanent form has accumulated so
rapidly and is so readily available that university students are no
longer dependent upon a faculty for intellectual nourishment in the
same sense as they are for stimulation and guidance. Particularly
is this true unless it can be shown that student learning is noticeably
enhanced by the presentation of information in lecture form as
opposed to the same materials presented in mimeographed or printed
form and read by the students. If the learning is equal under the 
two conditions, it would seem that the lecture might well receive 
much less emphasis, thereby serving to release the instructor from 
this responsibility and making it possible for him to assume others of 
greater educational significance. Most university teachers, unless 
too thoroughly soaked in the medieval traditions, would admit that 
they have at present insufficient time to devote to the problems of 
individual students, and after all, these should probably be the major 
concern of the undergraduate faculty. 


INVESTIGATIONS OF THE LECTURE METHOD 


The lecture method, as it is ordinarily understood, consists in an 
instructor telling his students about a particular field of knowledge. 
Pedagogically, it is distinguished from other teaching techniques such 
as the oral quiz, the discussion, or the project in which the activity 
of the students—the learners—is more apparent. Many who lecture 
do so with sufficient informality to permit of rather frequent interruptions
and interrogations,¹ but such, of course, is not the spirit of
that method of teaching. The typical formal lecture is well illustrated
in the field of history, wherein the instructor frequently puts in
two full hours a week telling a large class about the course of human 
events, with one period set aside for clarification or relief, and called 
a “quiz” section.

Most of the evidence regarding teaching by the lecture method is 
of the anecdotal or testimonial type. It has been said that the 
lecturer serves to animate his subject-matter, that he can intersperse 
his recitation of facts with sparkling wit and interesting current 
illustrations. Others claim that with the lecture method the students 
will receive the very latest information at a great economy of time, 
inasmuch as the instructor does the research work, winnowing the 
wheat from the chaff, and the students are spared this labor. The 
most frequently suggested attribute of the “telling” method, however,
is that it provides for personal contact between students and teacher,
which everyone admits should be beneficial.





1 See Bane, Charles L.: The Lecture in College Teaching, Richard G. Badger,
Boston, 1931, Chapter IV, “Hints from Some Master Lecturers.” To the
present writer the most interesting thing about many of these “master lecturers”
(Abelard, Adam Smith, Benjamin Silliman, Francis Lieber et al.) was that their
success depended largely upon frequent departure from the formal lecture to a
discussion type of teaching.

It is apparent that these statements, while conceivably true, do
not constitute, a priori, any defense of the lecture method of teaching 
as such. Were an instructor to mimeograph his lectures at the 
beginning of each semester he would still be saving students’ time, 
his materials might still be up-to-date, and if he saw fit, they might 
contain as much humor as the subject or the teacher could generate. 
The personal contact between instructor and students which the 
lecture provides in some degree, could be made even more intimate and 
educative were lecture time devoted to discussions or consultations. 

Inasmuch as lecturing, as opposed to the discussion or quiz, is 
primarily a means of presenting information, the other alternative 
would seem to be to provide opportunities for students to acquire 
this same information through reading. In terms of the methodology 
involved the lecture and the discussion methods are not strictly 
comparable. The former consists in an attempt to present facts with 
the tacit assumption that they will be reacted to by the students and 
hence mastered, whereas in the discussion method the activity of the 
teacher is lessened and that of the students increased. From this 
point of view reading and listening to lectures, other than with respect 
to the difference in sense avenues involved, are quite similar psy- 
chologically and functionally. 

The advantage in the reading method which immediately presents 
itself is that reading rate can be adapted to the ability of the individual 
student to comprehend, while the rate at which ideas are dispensed 
in the lecture is the same for all listeners in the group. Other argu- 
ments, such as the permanence of the written record, and the possi- 
bility of note making without attention oscillation and its attendant 
risks, are also obvious. But these reasons are logical and not empir- 
ical, and the real issue, which must be decided experimentally before 
the lecture method may be considered justified, is whether students 
actually learn more from the lecture than they would from identically
worded materials in printed form, studied for the same length of time.

Most of the experimental investigations of the lecture method of
teaching have consisted of attempts to compare it with the demonstration
and individual laboratory methods of teaching the sciences.¹ In
most instances, the lecture, unsullied by discussions or demonstra- 
tions, has not proved particularly efficient as an instructional method. 

1 Payne, V. F.: “Lecture-demonstration and Individual Laboratory Methods
Compared.” J. Chem. Educ. Vol. IX, pp. 932-939; 1097-1102; 1277-1294; May,
June, July, 1932.

Another group of studies compared the learning resulting from
lectures with that resulting from classroom discussions. Bane¹ has
summarized these findings as follows: (1) “The lecture and class
discussion methods of college teaching appear about equally effective 
in the immediate recall of content material, and (2) the class discussion 
method is more effective than the lecture in the delayed recall of 
subject-matter.” It should be kept in mind that the subject-matter 
tests employed to measure the retention in these comparative studies 
were so constructed as to measure factual materials only and were 
thereby better adapted to the evaluation of lectures than discussions. 
Morris and Douglass² found the problem method slightly superior
to the lecture method in the teaching of Economics, as did Tuttle
and Douglass³ for Psychology. Miss Scheideman⁴ reported that
students improved about the same amount whether the lecture- 
conference or a more highly individualized method was employed. 
That lectures can be made better is established. At Ohio State 
University a technique was developed whereby student reactions to 
both the content and the presentation of the lecture were markedly 
improved.⁵ Israeli⁶ has determined those aspects of the lecture to
which the students react most satisfactorily, and the repetition of
text materials, a significant part of most lectures, was generally
considered to be of little value or interest. Pressey,⁷ in an “experimental”
class in which an attempt was made “to break away as completely as
possible from both lecture and recitation methods and develop a
socialized procedure,” felt that the innovation was a marked success.
Evidence of this was advanced in the form of student opinion as well





1 Bane: Op. cit. p. 274. This reference presents a good summary of the status 
of the lecture method in college practice although the writer felt that most of the 
interpretations unduly rationalized criticisms of the lecture. 

2 Douglass, Harl A. et al.: “Controlled experimentation in the study of methods
of college teaching.” University of Oregon Publications, Vol. 1, No. 7, 1929,
pp. 285-292.

3 Ibid., pp. 293-299.

4 Scheideman, Norma V.: “A comparison of Two Methods of College Teaching.”
Sch. and Soc., 25, 1928, 672-674.

5 Cowley, H. W.: “Evaluating Group Lecture Courses.” Educ. Res. Bull. 12,
1933, pp. 1926-1928.

6 Israeli, Nathan: “Students’ Ratings of Lectures.” J. Educ. Psychol. 24,
1933, pp. 236-239.

7 Pressey, S. L. et al.: Research Adventures in University Teaching. Public 
School Publishing Company, Bloomington, Ill., 1927, pp. 134-139. 
as the increased per cent of students electing to continue work in the
field. 

One generalization which would inevitably occur to any one familiar 
with the literature comparing the lecture with other teaching methods 
is that one instructional technique is about as good as another. There 
would be exceptions to this conclusion, of course, but ever so many of 
the results were either negative or so inconclusive as to require rather 
refined statistical manipulations to bring out any significant differ- 
ences. Either the essential factor in good instruction is quite inde- 
pendent of and different from what we have designated as the ‘‘ method 
of teaching” employed, or the latter is unable to manifest itself in 
the presence of so many uncontrolled variables. The latter possibility 
suggests the need of more elaborate and better controlled experiments. 

The only reports the writer has been able to find of the relative 
efficiency of instruction by reading versus instruction by lecture have 
been made by Greene¹ and Thompson.² As was stated above, a
comparison of these two teaching methods is most reasonable, inas- 
much as each is primarily a means of engendering familiarity with 
information, and any subject-matter which can be presented in the 
formal lecture can also be rendered in writing and given students for 
study. 

Greene employed a total of two hundred seventy-six men and 
subjected them to the following procedures. Lectures were given 
to one-half of each class and mimeographed sheets covering the same 
general topic to the other half. These materials were not identical. 
Motivation was provided by the “this will count on your grade” 
method. The students were encouraged to take notes as usual, and 
their retention was measured by the scores made on delayed and 
immediate recall and recognition tests. The comparisons made of 
the retention of readings and lectures were rather consistently negative.
There was no significant difference between the test scores for
the entire group. There was some tendency, however, for those scoring
in the highest quartile of the psychological test to do better on the 
reading than on the lecture materials. The notes on lectures were 
judged to be consistently more complete than those taken on the 





1 Greene, E. B.: “The Relative Effectiveness of Lectures and Individual Readings
as Methods of College Teaching.” Genetic Psychol. Monog., 4, 1928.

2 Thompson, Lorin A.: “A Report on a Note Taking Experiment at Ohio
Wesleyan University.” Ohio College Association Bulletin, No. 77, Ohio State 
University, Columbus, Ohio (no date). 













Learning from Lectures and Readings 465 


readings, although no objective test was made of their value to the 
students. 

Thompson was primarily interested in determining whether note 
making ability could be used as a means for predicting scholarship. 
He used three hundred nine freshman English students at Ohio 
Wesleyan University, and employed an objective measure of the 
value of notes, namely, the contribution they made to the student’s 
success on an examination. The subjects made notes for thirty-five 
minutes on dictated and read materials and then were permitted to 
use the notes in answering test questions. The examinations based on 
dictation were given the higher average marks, namely 38.80 ± 11.84
SD as opposed to 35.80 ± 11.52 SD for the reading examination. This
difference is statistically significant (diff. = 3.00 ± .94 SD), but how
much of it was due to the notes as distinct from the ability to under- 
stand lectures vs. readings, was a problem with which Thompson was 
not primarily concerned. 
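
The arithmetic behind a statement such as "diff. = 3.00 ± .94" can be sketched as follows. This is only an illustration: Thompson's computation is not described here, and the sketch assumes that both sets of marks come from 309 students each and may be treated as uncorrelated. Under that assumption the usual formula for the standard error of a difference between two means gives a value close to the .94 quoted above.

```python
import math

def se_of_difference(sd_a, n_a, sd_b, n_b):
    """Standard error of the difference between two uncorrelated means."""
    return math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)

# Means and SDs as quoted in the text. Using n = 309 for both examinations,
# and ignoring any correlation between them, is an assumption of this sketch.
diff = 38.80 - 35.80
se = se_of_difference(11.84, 309, 11.52, 309)
critical_ratio = diff / se

print(f"difference = {diff:.2f}")
print(f"standard error of difference = {se:.2f}")
print(f"difference / standard error = {critical_ratio:.2f}")
```

A difference about three times its standard error is what the text describes as statistically significant.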

The present investigation is in some respects a repetition of Greene’s 
with the following rather significant variations. 

1. The materials given in lecture and reading were not “judged 
to be equivalent” as were Greene’s, but they were made identical. 
The writer believes that evaluation of materials by instructors in 
terms of their difficulty for students is at best a matter of conjecture. 
To compare the retention of lecture and reading materials with the 
factor of difficulty uncontrolled is hazardous. 

2. No students were allowed to make notes during either lecture 
or reading. While it is admitted that to allow note making might 
have made the experimental situation more analogous to the classroom 
situation, such would have served to introduce an additional variable. 

3. The tests used in the present study for both lecture and reading 
were considerably more reliable than were Greene’s—his being approxi- 
mately +.50, the writer’s +.77. 


EXPERIMENTAL TECHNIQUE 


The subjects used by the writer were all freshmen in the Teachers 
College of the University of Nebraska. The experiment was con- 
ducted during that part of an orientation course in which particular 
attention was devoted to note making from both lectures and readings. 
This may have been of some value with regard to motivation and 
cooperation on the part of the subjects. The data were gathered 
in the course of regular class work and no impression was current 
regarding the misuse of students for publication purposes. The
general experimental set-up involved the equated groups technique,
the difference between the two groups being that one read the materials 
and the other listened to a memorized lecture, identically worded. 
Table I sums up the situation concerning the equation of the two 
groups. It seemed important to guarantee equality with respect to 
vocabulary, reading ability, and general psychological test scores, 
although the size of the groups would almost assure chance equation.¹


TABLE I.—COMPARABILITY OF READING AND LECTURE GROUPS

                            Reading            Lecture         Difference and its
Test                      Mean     SD       Mean     SD        standard deviation
Vocabulary               24.52    8.91     24.16   10.95          .36 ± 1.56
Reading                  27.94    8.34     28.59    9.72          .65 ± 1.44
Psychological test       90.22   32.10     90.84   32.30          .62 ± 5.01

Number of subjects          83                82








Scores were available for all subjects upon the Ohio State Psychological 
Examination Form 17, which gives separate measures for vocabulary 
and reading. None of the differences between the two groups was 
significant—in fact the largest difference, that between the reading 
scores, was less than half of its standard deviation. The “lecture”
group was slightly more heterogeneous with respect to all of the 
abilities measured. 

The investigation was planned as follows. A twenty-five hundred 
word lecture on “outlining” was prepared and mimeographed in 
quantity. The writer familiarized himself with this material and 
found that his delivery averaged very close to one hundred words per 
minute. In order to hold the time factor constant it was decided to 
allow those students who read the materials as many minutes for 
their study as were required for the lecture. Immediately after both 
the lecture and the reading period a test of immediate recall was 
administered. This same test was given again, without warning, two 
weeks later to measure retention over a longer interval. The test 
used was a semi-objective true-false, completion, and short-answer
type with a reliability, computed odd against even items and corrected 
by the Spearman-Brown prophecy formula, of +.773 for the reading 


1 Corey, Stephen M.: “The Dependence upon Chance Factors in Equating
Groups.” Amer. J. Psychol., 45, 1933, pp. 749-752.

materials and +.779 for the same test over the same subject-matter
presented in the lecture. These reliabilities are sufficiently high to 
make group comparisons valid. 
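
The Spearman-Brown correction mentioned above can be illustrated briefly. The odd-even (half-test) correlations themselves are not reported, so the sketch below works backward from the corrected reliabilities of +.773 and +.779 to the half-test correlations they would imply. The function names are illustrative only.

```python
def spearman_brown(r_half, factor=2.0):
    """Reliability of a test lengthened `factor` times, from the shorter test's r."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)

def implied_half_correlation(r_full):
    """Invert the doubling case: the odd-even r that yields r_full when stepped up."""
    return r_full / (2.0 - r_full)

# Reported full-length reliabilities; the half-test values are back-computed,
# not taken from the study.
for label, r_full in (("reading", 0.773), ("lecture", 0.779)):
    r_half = implied_half_correlation(r_full)
    print(f"{label}: odd-even r of about {r_half:.3f} "
          f"steps up to {spearman_brown(r_half):.3f}")
```

The step-up is considerable: an odd-even correlation in the low .60's is sufficient to yield a full-test reliability in the high .70's.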

The subjects were not permitted to make notes, but they were told 
that an immediate recall test would be given over the materials 
involved. No notes were made because of the fact that the reports 
of both Greene¹ and Thompson² seemed to imply that note making
is more complete for lectures than for readings. In an independent
study³ the writer obtained just the opposite results, but in any event
the inclusion of note making would have introduced another variable 
to complicate the problem with which the writer was chiefly concerned, 
namely, do students learn more from listening to a lecture or from 
reading identically worded materials for the same length of time. 
Were notes made, it would have been impossible to determine how 
much of any difference discovered was due to this factor, or to some 
more or less inherent difference between the learning resulting from 
listening to lectures or reading. 

After the tests had been given and scored the “lecture” and
“reading” groups were compared with respect to the following 
considerations: 

1. Scores on the immediate recall and delayed (fourteen days) 
recall tests. 

2. Correlation between immediate recall reading and lecture tests 
and these factors: 

(a) Standardized reading test score. 

(b) Standardized vocabulary test score. 

(c) Psychological test score. 

3. Relative success of students in the first, last and middle two 
psychological test quartiles with respect to both immediate and 
delayed recall. 


RESULTS 


Table II represents the mean scores and the difference between
them for the reading and lecture groups on immediate and delayed
recall. For immediate recall the reading group was superior to the
lecture group. Taking exactly the same test over identical materials,
reacted to for the same length of time, reading resulted in somewhat 





1 Greene: Op. cit. 

2 Thompson: Op. cit.

3 Corey, Stephen M.: “Making Notes from Lectures and Readings.” Journal
of Educational Research, 1934 p. 27 (in press). 
TABLE II.—COMPARISON OF MEAN SCORES OF LECTURE AND READING GROUP WITH
RESPECT TO IMMEDIATE AND DELAYED (FOURTEEN DAYS) RECALL

                        Reading           Lecture                        Chances in one hundred
                      Mean     SD       Mean     SD      Diff.    SD     that difference is significant
Immediate recall     22.87    3.33     21.50    3.48     1.37    .43              100
Delayed recall       17.22    4.08     17.12    3.77      .10    .63               56


























greater immediate retention than listening to a lecture. The differ- 
ence between the scores of the two groups on a surprise quiz given 
fourteen days after the presentation of the subject-matter again 
favored the reading group but was statistically insignificant.
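
The last column of Table II can be approximated from the two columns before it. The sketch below assumes that "chances in one hundred" means the normal-curve probability, expressed per hundred, that the true difference is greater than zero, given the obtained difference and its standard deviation. The study does not state the computation, so this reading is an assumption, though it reproduces both tabled entries.

```python
import math

def normal_cdf(z):
    """Area under the unit normal curve below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def chances_in_100(diff, sd_diff):
    """Chances in one hundred that the true difference exceeds zero (assumed meaning)."""
    return round(100 * normal_cdf(diff / sd_diff))

# Differences and their standard deviations as given in Table II.
print("immediate recall:", chances_in_100(1.37, 0.43))
print("delayed recall:", chances_in_100(0.10, 0.63))
```

On this reading a value near 50 indicates a difference no better than chance, which is the situation for delayed recall.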

Table III represents the product-moment correlations between the
reading or lecture results and measures of vocabulary, reading ability
and intelligence.

TABLE III.—CORRELATIONS BETWEEN LECTURE AND READING RESULTS AND
CERTAIN OTHER MEASURES

Measure              Reading scores       Lecture scores
[illegible]          .474 ± .06 PE        .294 ± .07 PE
[illegible]          .436 ± .07 PE        .276 ± .07 PE
[illegible]          .491 ± .06 PE        .327 ± .07 PE

None of the correlations is corrected for attenuation,
but each is sufficiently large to indicate the existence of a
significant degree of relationship. The correlations involving retention
of materials read were consistently higher than those involving the
retention of materials heard in lecture. This might be accounted
for in the case of the correlation between examination scores based
on materials read and standardized reading test results in terms of
community of function. Somewhat the same point could be made
with respect to the correlations between reading retention scores and
vocabulary scores in that the latter were derived from a printed test.
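
The probable errors attached to the correlations in Table III are of the size given by the customary formula, PE = .6745(1 − r²)/√N. The sketch below applies that formula on the assumption that N is 83 for the reading group and 82 for the lecture group, as in Table I. The study does not say how its probable errors were computed, and not every tabled value need round to exactly the same figure.

```python
import math

def probable_error_of_r(r, n):
    """Probable error of a product-moment correlation r based on n cases."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)

# Correlations from Table III; the group sizes are assumed from Table I.
reading_rs = (0.474, 0.436, 0.491)
lecture_rs = (0.294, 0.276, 0.327)

for r in reading_rs:
    print(f"reading  r = {r:.3f}  PE = {probable_error_of_r(r, 83):.3f}")
for r in lecture_rs:
    print(f"lecture  r = {r:.3f}  PE = {probable_error_of_r(r, 82):.3f}")
```

Each coefficient is at least four times its probable error, which agrees with the statement that a significant degree of relationship exists.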

Table IV represents the scores made on lecture and reading tests 
by students in different psychological test quartiles. The superior 
students seemed to do relatively better on the reading materials but 
there was almost no compensatory tendency for the inferior ones to 
get any more out of the lecture. These results are apparently in 
harmony with Greene’s. 
TABLE IV.—MEAN SCORES PER PSYCHOLOGICAL TEST QUARTILE FOR THE TWO
GROUPS ON IMMEDIATE RECALL

                       Reading                   Lecture                            Chances in one hundred
Quartile       Number   Mean    SD       Number   Mean    SD      Diff.    SD      that difference is significant
Highest          24    24.00   2.65        27    22.00   3.10     2.00    .81                99
Middle two       42    22.70   2.67        32    21.44   3.46     1.26    .79                86
Lowest           16    18.94   2.19        23    19.22   3.62      .26    .79                58
































Table V represents similar findings for delayed recall. None of
the differences is significant although there is a noticeable tendency
for the superior students to do somewhat better on the reading and the
inferior students somewhat better on the lecture. The groups represented
in both Table IV and Table V are not sufficiently large to
permit of much generalization.

TABLE V.—MEAN SCORES PER PSYCHOLOGICAL TEST QUARTILE FOR THE TWO
GROUPS ON DELAYED RECALL

                       Reading                   Lecture
Quartile       Number   Mean    SD       Number   Mean    SD      Difference
Highest          24    19.78   4.42        27    19.00   3.59        .78
Middle two       41    16.75   3.33        32    17.10   3.31        .35
Lowest           13    13.93   2.24        22    14.92   3.57        .99


CONCLUSIONS 


The following statements would appear to be justified in view of the 
technique employed in the present study: 

1. Immediate recall is better for materials students have read 
than for the same materials heard in lecture. 

2. The two types of presentation have no very significant effect 
upon delayed (fourteen days) recall. 

3. The scores on tests measuring retention of materials read are 
more closely related to standardized test results for reading, vocabulary 
and intelligence than are scores on tests measuring the retention of 
materials listened to in lecture. 

4. There is a tendency for students scoring in the highest psychologi- 
cal test quartile to do relatively better on reading than on lecture tests. 

5. When students in the highest psychological test quartile of the
reading group are compared with those in the highest quartile of the 
lecture group with respect to delayed recall, no significant differences 
appear. The same is true of other psychological test quartiles. 


DISCUSSION 


Admittedly, the conditions of the present study were not any too 
analogous to those surrounding learning as it is said to occur in the 
classroom. The length of time devoted to lecture and reading was 
less than half an hour—a small fraction of the total time spent on any 
one subject during the course of a semester. Furthermore, it is 
believed that motivation was at a higher level during the presentation 
of both types of materials than might be the case during the ordinary 
routine of class work. 

The ability to understand a lecture can undoubtedly be improved, 
just as can the ability to comprehend written materials. But, by 
far the greater part of the information acquired after college results 
from conversation (discussion method) or from reading, and it would 
appear to be at least questionable, in the light of the results of this 
and other studies, to insist that students master a relatively new 
technique as a college tool only. Particularly is this true when the 
value of this new tool appears to be at least no greater than that of 
another, reading, which is more commonly employed, and which 
inherently possesses greater possibilities for adaptation to individual 
differences. No virtue claimed for the lecture is intrinsically and 
necessarily a virtue of the lecture as such. Most of the greatest 
advantages can be retained by the multiprinting of lectures, and the 
one characteristic to which attention is invariably directed, namely, 
personal contact between instructor and student, can be rendered even 
more educative by the use of the discussion method.

It would seem that dispensing with the purely informative lecture 
might enable many institutions to improve their instructional facilities 
by devoting more time to individualized instruction at practically no 
increased cost and with a probable increase in student learning. As 
President Butler of Columbia University has recently written, “The
lecture system as a means of communicating facts should have been
dispensed with when the art of printing was invented.”¹





1 Butler, Nicholas Murray: “Report of the President of Columbia University
for 1933.” Columbia University Bulletin of Information, 34th series, No. 12, Dec. 23,
1933, p. 30.


THE EFFECT OF HYPNOSIS ON LEARNING TO SPELL


W. H. GRAY 


Kansas State Teachers College, Emporia, Kansas


It is the purpose of this investigation to determine whether spelling 
can be taught more readily under the influence of hypnosis than in 
the normal state. 

The subjects used were six male students of the Kansas State 
Teachers College of Emporia chosen on the basis of their college- 
entrance tests. In these tests students are ranked in deciles for each 
test taken and for all tests combined. All six subjects proved to be 
weak in spelling but ranked above the seventh decile in general 
performance. Deep hypnosis could be produced in each subject and 
each reacted well to post-hypnotic suggestions. 

The spelling material was obtained from the Buckingham Extension 
of the Ayres Spelling Scale. In this scale words are arranged in columns 
in order of difficulty. Words were dictated to the subject from the 
scale in order on each day of the experiment until from twenty to 
thirty misspellings were secured. Two lists of an equal number of 
words were compiled from these misspellings. The lists were kept 
equal in difficulty by allotting the misspellings alternately. One of
these lists was termed the hypnotic list and the other the non-hypnotic 
list. Both lists were taught to the subject on the same day. This 
procedure was repeated on each succeeding day of the experiment until 
all words of the scale had been dictated. 

In presenting the hypnotic list, the subject was hypnotized, seated 
at a table and given pencil and paper. He was told how to spell 
each word and made to write it. At the same time the suggestion was 
given that in the future he would always spell the word in question 
in the way he had just written it. This was continued until all the 
words were taught. The whole list was then dictated and the subject 
made to write it. If an error occurred, the word was retaught, until 
finally the whole list could be written without an error. The subject 
was then informed that he would always spell this list of words cor- 
rectly and was awakened from hypnosis. 

The non-hypnotic list was presented in a similar manner except 
that no suggestions were made and no hypnosis induced. 

In order to obviate the effect of practice, the lists were taught in 
alternate order on succeeding days of the experiment. That is, on 
one day the hypnotic list would be presented first and the next day
the non-hypnotic list would be presented first. 
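
The allotment of misspellings and the alternation of teaching order described above may be sketched as follows. The words are hypothetical. It is also an assumption of the sketch that an odd word is dropped to keep the two lists equal in number, and that the hypnotic list comes first on the first day, since neither point is stated.

```python
def allot_alternately(misspellings):
    """Split misspellings, taken in scale order, into two lists of equal length."""
    hypnotic = misspellings[0::2]
    non_hypnotic = misspellings[1::2]
    # Assumption: with an odd count the last word is dropped to keep the lists equal.
    size = min(len(hypnotic), len(non_hypnotic))
    return hypnotic[:size], non_hypnotic[:size]

def presentation_order(day):
    """Alternate which list is taught first on succeeding days (days count from 1)."""
    if day % 2 == 1:
        return ("hypnotic", "non-hypnotic")
    return ("non-hypnotic", "hypnotic")

# Hypothetical misspellings, listed in order of increasing scale difficulty.
words = ["separate", "receive", "necessary", "occasion", "privilege"]
hypnotic_list, non_hypnotic_list = allot_alternately(words)

print("hypnotic list:", hypnotic_list)
print("non-hypnotic list:", non_hypnotic_list)
for day in (1, 2, 3):
    print("day", day, "order:", presentation_order(day))
```

Because adjacent words on the scale are of nearly equal difficulty, alternate allotment keeps the two lists closely matched without any separate rating of the words.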

On succeeding days of the experiment, written tests were given 
on the words taught on the preceding day and the number of errors 
in each list recorded. A record was also kept of the words which 
were missed in each list. When a sufficient number of these had 
been collected, a re-presentation was made under hypnosis of those 
words missed in hypnotic lists and without hypnosis of those words 
missed in the non-hypnotic lists. 


RESULTS OF THE INVESTIGATION 


A summary of the data for the experiment is presented in Table I. 
The table is read as follows: Subject 1 in 16 trials was presented with 
214 words under hypnosis, on the testing of which 73 words or 34.11 
per cent were spelled incorrectly as compared with 214 words pre- 
sented without hypnosis, on the testing of which 81 words or 37.85 
per cent were spelled incorrectly, making a difference of 8 words or 
3.74 per cent in favor of learning under hypnosis.

TABLE I.—COMPARISON OF ERRORS IN SPELLING AS TAUGHT UNDER HYPNOSIS AND
WITHOUT HYPNOSIS

                           Hypnotic                        Non-hypnotic                 Difference
Subject  Trials   Words  Number of  Per cent of    Words  Number of  Per cent of    Number  Per cent
                          errors      errors                errors     errors
   1       16      214      73        34.11         214      81        37.85           8      3.74
   2        5       59       5         8.47          59       2         3.39          -3     -5.08
   3        7       82      23        28.05          82      26        31.71           3      3.66
   4        4       47       9        19.15          47      12        25.53           3      6.39
   5        3       41       5        12.19          41      11        24.39           6     12.20
   6       11      131      59        45.03         131      53        40.45          -6     -4.58
Total              574     174        30.31         574     185        32.23          11      1.92

It will be noted that the number of trials per subject required to learn the scale varies
from 3 to 16. Subjects 1 and 6 required the greatest number of trials. 
Subject 5 required the least number. Subjects 2 and 6 learned some- 
what better without hypnosis while the remaining 4 learned better 
under the influence of hypnosis. Subject 6, however, had a much
higher percentage of errors in both lists than did Subject 2. The 
percentage of errors does not seem to be due to hypnosis or non- 
hypnosis but rather to individual differences because in every case 
where the errors are low in one list they are also low in the other list. 
A large number of trials apparently makes little difference one way or 
the other since Subject 1 shows a difference of 3.74 per cent in favor 
of hypnotic learning while Subject 6 shows a difference of 4.58 per 
cent in favor of non-hypnotic learning. The greater the number of 
trials any subject requires to learn the scale apparently the less 
difference there is between learning by hypnosis and without hypnosis. 
There does seem to be some relationship between the number of trials 
and the difference in percentage of errors. Subjects 4 and 5 required 
only 4 and 3 trials respectively to learn the unknown words of the scale 
and these subjects show the greatest differences between hypnotic 
and non-hypnotic learning. The total difference, however, is only 
1.92 per cent in favor of hypnotic learning. This is not sufficiently 
great to warrant a widespread use of hypnosis as a means of teaching 
spelling. The experiment is interesting, however, from the standpoint 
that learning does take place readily under the influence of hypnosis. 
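
The percentages in Table I follow directly from the counts of words and errors. As a minimal sketch, the figures for Subject 1 and for the totals, both of which are quoted in the text, can be recomputed in this way. The function names are illustrative.

```python
def per_cent_errors(errors, words):
    """Errors expressed as a percentage of the words presented."""
    return 100.0 * errors / words

def summarize(words, hypnotic_errors, non_hypnotic_errors):
    """Return (hypnotic %, non-hypnotic %, difference in favor of hypnosis)."""
    hyp = per_cent_errors(hypnotic_errors, words)
    non = per_cent_errors(non_hypnotic_errors, words)
    return round(hyp, 2), round(non, 2), round(non - hyp, 2)

# Counts as given in Table I: words per list, hypnotic errors, non-hypnotic errors.
subject_1 = summarize(214, 73, 81)
totals = summarize(574, 174, 185)

print("Subject 1:", subject_1)
print("Totals:", totals)
```

A positive difference indicates fewer errors on the hypnotic list, and a negative one, as for Subjects 2 and 6, fewer errors on the non-hypnotic list.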


BOOK REVIEWS


SHEPHERD IVORY FRANZ. Persons One and Three. New York:
McGraw-Hill, 1933, pp. 188 + XV. 


With the death of Dr. Morton Prince and the popularizing of 
psychoanalytic concepts, there has been a lag of interest in multiple 
personality. The latest book about this topic is Dr. Franz’s Persons 
One and Three. Unlike the earlier works written by psychologists, 
it is largely on a descriptive, observational level. Intentionally and 
consistently, Dr. Franz avoids consideration of the forty-three year old 
patient discussed in his book as dissociating into Poultney, Polting 
and Poulting in terms of hypothetical neurological concepts or Freud- 
ian mechanisms. As Dr. Franz sees it, the difference between the 
so-called multiple personality and normal individual is one of degree. 
In normal individuals the forgetting periods are of little importance, 
and because of their great number we pay little attention to them. 
Divisions there are in all individuals, according to his thinking. The 
divisions of the personality discussed in this book are merely more 
definite, more important to the particular individual’s adjustments 
than is usually the case, and more spectacular. 

The “persons one and three” first came to the attention of Dr.
Franz on a morning in December, 1929, when the Los Angeles police 
picked him up wandering in a dazed condition, unable to recognize his 
surroundings or to supply his name or give the date. Amnesias of 
the patient took place during and subsequent to his services in the 
World War. The fifteen years from April, 1915, to March 5, 1930, 
seemed to be completely blotted out. Dr. Franz describes him at that 
point as a man of forty-two with memories of experiences for only 
twenty-seven years, whose appearance did not correspond with his 
memories. Other dissociated periods are of comparatively very short 
duration. 

A large portion of the book is devoted to a description of the char- 
acter and relevant events. The rest is largely a description, sometimes 
given in great detail, of Dr. Franz’s experiences with the patient. 
Concerning the nature of the personality changes he observed and to 
some extent influenced, particularly of the “combined” personality,
Dr. Franz makes very few generalizations. The more significant of 
these can be summarized as follows: The patient was a fluctuating 
character even before he experienced the more drastic episodes 
described in the book; every change observed was preceded by a period 
in which he was under emotional strain—that is, he was worried finan-
cially, irritated, fearful, or else over-elated. 

The integration of personalities described is brought about
either by suggestions through direct questioning on the part of Dr. 
Franz that, when in one personality, he think about vivid memories of 
his other divisions, or else by accidental experiences which made him 
recall previous events. 

The methods of analysis and treatment that Dr. Franz used as 
well as the actual description of events in the life history of the char- 
acter are described in simple language that can be easily understood 
even by the non-professional reader.

H. MELTZER.
Psychological Service Center, St. Louis.


H. L. HOLLINGWORTH. Educational Psychology. New York: D.
Appleton and Co., 1933, pp. XVI + 540.


The author might almost have called his book The Educational
Psychology of Cue Reduction. After which let us hasten to add that 
this is practically the only criticism that can be made. The selection 
and arrangement of topics is excellent. The description and evalua- 
tion of much experimental work is quite masterly. The author shows 
exceptional balance in evaluating educational principles and practices 
in the light of the findings of psychology. His last two chapters are in 
fact an excellent introduction to educational philosophy. Though he 
shows the same or even greater breadth of grasp in psychological 
matters as in educational, he does not show the same balance in inter- 
pretation. Everything must conform to his theory. He states as a 
fact that all motives are irritants, and this statement is reiterated 
throughout the book. Native capacity is confidently analyzed into 
docility (ability to respond to reduced cues) and scope (ability to react 
to total situations rather than to single elements). Thus (p. 42), “We
may, in fact, say that just as the feeble-minded are those who are
constitutionally deficient in docility or learning ability, so the neurotic
are those who are constitutionally lacking in scope or sagacity.”
The reviewer has no particular objection to the cue-reduction theory 
as such. It is an interesting and valuable one. But it is not as yet 
at least completely established and generally accepted. The author 
presents it as though it were, and implies indirectly that everyone who 
disagrees with it is either stubborn or unable to understand it. On 
p. 75, for example, there is a thinly veiled implication that Thorndike 
does not know the difference between ‘effect’ and ‘affect,’ or between a
material consequence and a neural or hedonic consequence. 

The teacher who is ready to accept the cue-reduction theory and its 
correlates as the fundamental psychological fact will find this a tre- 
mendously valuable book. The one who is not will find difficulty in 
using it as a text in elementary classes. There can be no reconciliation 
between the viewpoint of the book and any other viewpoint. 

EDWARD E. CURETON.
Alabama Polytechnic Institute. 


ROBERT H. GAULT. Criminology. New York: D. C. Heath and Co.,
1932, pp. 461 + IX. 


Many psychologists have made significant contributions to the 
study of delinquency and crime, but to date most textbooks on crime 
have been written by sociologists and criminologists. Criminology 
is written by a psychologist who also happens to be editor-in-chief of 
the Journal of Criminal Law and Criminology, Dr. Robert H. Gault. 
The chief characteristic of the book is the psychological approach to a 
consideration of a criminal personality. In fact, more than two-thirds 
of the book, fifteen out of twenty-one chapters, are devoted to a 
consideration of the various aspects of a criminal personality. Some 
of the aspects included are intelligence, drives, dissociation, psychop- 
athy, epilepsy, social attitudes and heredity. The general outlook 
of the author towards the criminal is suggested in the following 
selected quotations from Chapter 1, wherein a number of clinical 
histories of delinquents and criminals are included: 

“Criminals,” we are told, “are human beings much like the rest of 
us. They move about from place to place: They play and work more 
or less; they laugh, mourn and are otherwise moved emotionally 
as we are; they form personal attachments to persons, things, and 
places as we do; just as we are, so are they eager for the approval of 
those with whom they associate, and are cast down when they fail to 
secure it. They are ambitious ‘for a place in the sun’: the ‘sun’ 
being the circle of those who are, in general, seeking the same type 
of satisfactions that they desire for themselves. They think, learn, 
and forget as we do; and finally in respect to their physical make-up 
they resemble our neighbors in our city block. Moreover in all these 
respects they differ among themselves much as the members of our 
club differ among themselves.” 

The case histories included are for the larger part taken from the 
writings of Healy and Bronner and Clifford Shaw. Throughout the 
book the social genesis of criminal behavior is stressed, though a 
consideration of other relevant anthropological, biological and psychological 
factors is not omitted. In his discussions of personality 
and other relevant psychological contributions, Dr. Gault is not, nor 
apparently does he attempt to be, exhaustive or profound. Seemingly 
he aims to be and is relevant, intelligible and sane. His discussion of 
personality types, for example, is limited to a relatively simple discus- 
sion of introversion, extroversion and ambiversion. Critical students 
of testing might call his manner of treating personality tests simple and 
uncritical. Psychoanalysts might even argue against the legitimacy 
of his including in one chapter called “Dissociation and Allied Phenomena” 
such psychoanalytic concepts as repression, rationalization and 
conflict. In his chapter on the influence of race he 
refers to the army results on the intelligence of recruits and comes to 
the conclusion that the chief differences among the chief races of 
Europe are usually stated in other terms than of intelligence, and then 
gives three descriptions of the Nordic, Mediterranean and Alpine 
races which few cultural anthropologists would subscribe to. In the 
light of recent events in Europe, the following quotation is likely to 
evoke a smile on the face of individuals who are well balanced, whether 
they are or are not anthropologists, and a frown from anthropologists 
who take themselves seriously: “The Mediterranean race is impulsive, 
clannish, emotionally expressive, and less relentless in pursuit of a 
purpose than either the Nordics or the Alpines. They are personally 
brave, and, unlike the Nordics, they are sensitive to affronts and capa- 
ble of personal cruelty.” 

Similar statements could without much difficulty be made about 
his treatment of other topics. But all in all, for the task he set himself, 
the author has done a good job of selecting materials and presenting 
them in a very clear and telling manner. 

Part II, called the “Struggle against Crime,” is devoted to a 
consideration of institutional and extra-institutional treatment of 
criminals, methods for obtaining evidence, psychological as well as 
legal; court procedure; and a final chapter on the prevention of crimin- 
ality. The appendix contains a list of topics presented at the United 
States Training School for Prison Officers in New York City; programs 
for criminologic research institutes; and a plan for a crime prevention 
bureau. 

In his consideration of treatment and therapy he emphasizes the 
influence of social organization on criminality; the need for applying 
facts of learning in treating human beings, even though they happen 
to be labelled criminals; and the fact that attitudes, even 
after once grown, can be modified “by the expedient of altering the 
situations in which they have been formed.” 

In general terms the author favors what he thinks represents the 
most progressive penological thought, which is, “the development 
of a progressive graduated system within the prison that affords 
larger and larger degrees of responsibility and freedom of action leading 
up to the complete responsibility that is involved in discharge.” The 
use of mental hygiene clinics in the preventive program is described in 
his final chapter. 

H. MELTZER. 
Psychological Service Center, St. Louis. 


Final Report of the Commission on Medical Education. New York: 
Office of the Director of Study, 1932, pp. 560. 


In 1925 the Association of American Medical Colleges organized a 
Commission on Medical Education to study educational principles 
in the light of present trends in education and social evolution. This 
volume is the final report of this commission. The major portion 
of the report is concerned with a discussion of questions of a medical 
course, including such aspects as postgraduate medical education, 
internship, medical licensure, premedical education and cost of 
medical education. Adequate consideration, however, is also given to 
the social aspects of medicine. This is done in three chapters on 
the public aspects of medicine, medical needs and the supply and 
distribution of physicians. The social point of view prevails and 
serves as a basis for interpretation in discussing all the other topics. 

Typical of attitudes which serve as a basis for the interpretation of 
material presented in this book are the following: Approval of medical 
planning for community needs as a whole rather than on the ability of 
individuals to pay; a geographic distribution of physicians in terms of 
social needs, rather than a concentration in the larger cities; the selec- 
tion of medical students on the basis of personality, intellectual ability, 
scholastic achievement, industry, general culture and evidence of a 
grasp of the principles of the underlying sciences of chemistry, physics, 
and biology, upon which much of medical study is dependent, rather 
than upon specific time, course and subject requirements; general 
premedical education rather than preprofessional. 

H. MELTZER. 


CLAIRETTE P. ARMSTRONG. 660 Runaway Boys. Boston: The 
Gorham Press, pp. 208. 


Running away from home is an offense that is reported to occur very 
frequently in all studies of delinquency, particularly in the studies of 
Healy and Bronner and of Slawson. In 660 Runaway Boys Dr. 
Clairette P. Armstrong attempts to answer the question: Why do boys 
desert their homes? on the basis of a statistical analysis of 660 runaway 
boys in the Children’s Court of New York City. Her investigation, 
then, is concerned with delinquent boys. Runaways held in court as 
neglected children under improper guardianship charges were not 
included in the study. The control groups used in this study, by and 
large, would not stand a critical analysis as adequate for comparative 
purposes from which generalizations are to be made about differences 
between runaway and other children. 

The major factors studied are chronological age, level of intelligence, 
educational achievement, economic level, marital status of parents, 
size of family, physical and nervous habits, concomitant offenses, insti- 
tutional experiences and companions. The specific aspects discussed 
include number and length of desertions, seasonal influences, institu- 
tions deserted and the boys’ reasons for deserting. 

Typical findings are: Runaways are reliably younger—by five 
months—than all the other delinquent boys before the court in 1929; 
the runaways are not reliably different from incorrigible boys and those 
held for unlawful entry in either verbal intelligence tests or performance 
tests; the majority of runaways are first-generation-American-born 
and have more than their share of foreign-born parents; the mothers 
of runaways are more often employed than are mothers of other public 
school children. 

In the case studies in the chapter called “Cases in Point” the author 
loses herself in a discussion of test results, thus obscuring the picture 
of the dynamics of the personality discussed. Her discussion of the 
questions involved in dealing with runaways indicates a healthy- 
minded social outlook, but somehow one does not get the feeling that 
this has very much to do with the study made. The conclusions 
referred to and the pleas made for the abolition of poverty, ignorance 
and misfortune could just as easily have been made without the study. 
But there is some research usefulness in the many tables presenting 
the findings and some social usefulness in the pleas made. 

H. MELTZER. 